Preface


Many different mathematical methods and concepts are used in classical 
mechanics: differential equations and phase flows, smooth mappings and 
manifolds, Lie groups and Lie algebras, symplectic geometry and ergodic 
theory. Many modern mathematical theories arose from problems in 
mechanics and only later acquired that axiomatic-abstract form which 
makes them so hard to study. 

In this book we construct the mathematical apparatus of classical 
mechanics from the very beginning; thus, the reader is not assumed to have 
any previous knowledge beyond standard courses in analysis (differential 
and integral calculus, differential equations), geometry (vector spaces, 
vectors) and linear algebra (linear operators, quadratic forms). 

With the help of this apparatus, we examine all the basic problems in 
dynamics, including the theory of oscillations, the theory of rigid body 
motion, and the hamiltonian formalism. The author has tried to show the 
geometric, qualitative aspect of phenomena. In this respect the book is 
closer to courses in theoretical mechanics for theoretical physicists than to 
traditional courses in theoretical mechanics as taught by mathematicians. 

A considerable part of the book is devoted to variational principles and 
analytical dynamics. Characterizing analytical dynamics in his “Lectures on
the development of mathematics in the nineteenth century,” F. Klein wrote
that “... a physicist, for his problems, can extract from these theories only
very little, and an engineer nothing.” The development of the sciences in the
following years decisively disproved this remark. Hamiltonian formalism 
lay at the basis of quantum mechanics and has become one of the most often 
used tools in the mathematical arsenal of physics. After the significance of 
symplectic structures and Huygens’ principle for all sorts of optimization 
problems was realized, Hamilton’s equations began to be used constantly in 
engineering calculations. On the other hand, the contemporary development
of celestial mechanics, connected with the requirements of space exploration, 
created new interest in the methods and problems of analytical dynamics. 

The connections between classical mechanics and other areas of mathe- 
matics and physics are many and varied. The appendices to this book are 
devoted to a few of these connections. The apparatus of classical mechanics 
is applied to: the foundations of riemannian geometry, the dynamics of 
an ideal fluid, Kolmogorov’s theory of perturbations of conditionally 
periodic motion, short-wave asymptotics for equations of mathematical 
physics, and the classification of caustics in geometrical optics. 

These appendices are intended for the interested reader and are not part 
of the required general course. Some of them could constitute the basis of 
special courses (for example, on asymptotic methods in the theory of non- 
linear oscillations or on quasi-classical asymptotics). The appendices also 
contain some information of a reference nature (for example, a list of normal 
forms of quadratic hamiltonians). While in the basic chapters of the book the 
author has tried to develop all the proofs as explicitly as possible, avoiding 
references to other sources, the appendices consist on the whole of summaries 
of results, the proofs of which are to be found in the cited literature. 

The basis for the book was a year-and-a-half-long required course 
in classical mechanics, taught by the author to third- and fourth-year 
mathematics students at the mathematics-mechanics faculty of Moscow 
State University in 1966-1968. 

The author is grateful to I. G. Petrovsky, who insisted that these lectures 
be delivered, written up, and published. In preparing these lectures for 
publication, the author found very helpful the lecture notes of L. A. Buni- 
movich, L. D. Vaingortin, V. L. Novikov, and especially, the mimeographed 
edition (Moscow State University, 1968) organized by N. N. Kolesnikov. The 
author thanks them, and also all the students and colleagues who communi- 
cated their remarks on the mimeographed text; many of these remarks were 
used in the preparation of the present edition. The author is grateful to 
M. A. Leontovich, for suggesting the treatment of connections by means of a 
limit process, and also to I. I. Vorovich and V. I. Yudovich for their detailed 
review of the manuscript. 


V. ARNOLD 


The translators would like to thank Dr. R. Barrar for his help in reading 
the proofs. We would also like to thank many readers, especially Ted Courant, 
for spotting errors in the first two printings. 


Berkeley, 1981 K. VOGTMANN 
A. WEINSTEIN 
Preface to the second edition


The main part of this book was written twenty years ago. The ideas and 
methods of symplectic geometry, developed in this book, have now found 
many applications in mathematical physics and in other domains of applied 
mathematics, as well as in pure mathematics itself. Especially, the theory of 
short wave asymptotic expansions has reached a very sophisticated level, with 
many important applications to optics, wave theory, acoustics, spectroscopy, 
and even chemistry; this development was parallel to the development of the 
theories of Lagrange and Legendre singularities, that is, of singularities of 
caustics and of wave fronts, of their topology and their perestroikas (in 
Russian metamorphoses were always called “perestroikas,” as in “Morse 
perestroika” for the English “Morse surgery”; now that the word perestroika 
has become international, we may preserve the Russian term in translation 
and are not obliged to substitute “metamorphoses” for “perestroikas” when 
speaking of wave fronts, caustics, and so on). 

Integrable hamiltonian systems have been discovered unexpectedly in many 
classical problems of mathematical physics, and their study has led to new 
results in both physics and mathematics, for instance, in algebraic geometry. 

Symplectic topology has become one of the most promising and active 
branches of “global analysis.” An important generalization of the Poincaré 
“geometric theorem” (see Appendix 9) was proved by C. Conley and 
E. Zehnder in 1983. A sequence of works (by M. Chaperon, A. Weinstein, J.-C. 
Sikorav, M. Gromov, Ya. M. Eliashberg, Yu. Chekanov, A. Floer, C. Viterbo,
H. Hofer, and others) marks important progress in this very lively domain. 
One may hope that this progress will lead to the proof of many known 
conjectures in symplectic and contact topology, and to the discovery of new 
results in this new domain of mathematics, emerging from the problems of 
mechanics and optics. 

The present edition includes three new appendices. They represent the
modern development of the theory of ray systems (the theory of singularities
and of perestroikas of caustics and of wave fronts, related to the theory of
Coxeter reflection groups), the theory of integrable systems (the geometric 
theory of elliptic coordinates, adapted to the infinite-dimensional Hilbert 
space generalization), and the theory of Poisson structures (which is a general- 
ization of the theory of symplectic structures, including degenerate Poisson 
brackets). 

A more detailed account of the present state of perturbation theory may be 
found in the book, Mathematical Aspects of Classical and Celestial Mechanics 
by V. I. Arnold, V. V. Kozlov, and A. I. Neistadt, Encyclopaedia of Math. Sci.,
Vol. 3 (Springer, 1986); Volume 4 of this series (1988) contains a survey 
“Symplectic geometry” by V. I. Arnold and A. B. Givental’, an article by 
A. A. Kirillov on geometric quantization, and a survey of the modern theory 
of integrable systems by S. P. Novikov, I. M. Krichever, and B. A. Dubrovin. 

For more details on the geometry of ray systems, see the book Singularities 
of Differentiable Mappings by V. I. Arnold, S. M. Gusein-Zade, and A. N. 
Varchenko (Vol. 1, Birkhäuser, 1985; Vol. 2, Birkhäuser, 1988). Catastrophe
Theory by V. I. Arnold (Springer, 1986) (second edition) contains a long 
annotated bibliography. 

Surveys on symplectic and contact geometry and on their applications may 
be found in the Bourbaki seminar (D. Bennequin, “Caustiques mystiques”, 
February, 1986) and in a series of articles (V. I. Arnold, First steps in symplectic
topology, Russian Math. Surveys, 41 (1986); Singularities of ray systems, 
Russian Math. Surveys, 38 (1983); Singularities in variational calculus, 
Modern Problems of Math., VINITI, 22 (1983) (translated in J. Soviet Math.); 
and O. P. Shcherbak, Wave fronts and reflection groups, Russian Math. 
Surveys, 43 (1988)). 

Volumes 22 (1983) and 33 (1988) of the VINITI series, “Sovremennye 
problemy matematiki. Noveishie dostijenia,” contain a dozen articles on the 
applications of symplectic and contact geometry and singularity theory to 
mathematics and physics. 

Bifurcation theory (both for hamiltonian and for more general systems) 
is discussed in the textbook Geometrical Methods in the Theory of Ordinary 
Differential Equations (Springer, 1988) (this new edition is more complete than 
the preceding one). The survey “Bifurcation theory and its applications in 
mathematics and mechanics” (XVIIth International Congress of Theoretical 
and Applied Mechanics in Grenoble, August, 1988) also contains new infor- 
mation, as does Volume 5 of the Encyclopaedia of Math. Sci. (Springer, 1989), 
containing the survey “Bifurcation theory” by V. I. Arnold, V. S. Afraimovich, 
Yu. S. Ilyashenko, and L. P. Shilnikov. Volume 2 of this series, edited by 
D. V. Anosov and Ya. G. Sinai, is devoted to the ergodic theory of dynamical 
systems including those of mechanics. 

The new discoveries in all these theories have potentially extremely wide 
applications, but since these results were discovered rather recently, they are 
discussed only in the specialized editions, and applications are impeded by
the difficulty of the mathematical exposition for nonmathematicians. I hope
that the present book will help not only mathematicians, but also all those
readers who use the theory of dynamical systems, symplectic geometry, and
the calculus of variations (in physics, mechanics, control theory, and so on)
to master these new theories. The author would like to thank Dr.
T. Tokieda for his help in correcting errors in previous printings and for 
reading the proofs. 


December 1988 V.I. Arnold 


Translator’s preface to the second edition 


This edition contains three new appendices, originally written for inclusion in 
a German edition. They describe work by the author and his co-workers on 
Poisson structures, elliptic coordinates with applications to integrable sys- 
tems, and singularities of ray systems. In addition, numerous corrections to 
errors found by the author, the translators, and readers have been incorpo- 
rated into the text. 
PART I
NEWTONIAN MECHANICS 


Newtonian mechanics studies the motion of a system of point masses 
in three-dimensional euclidean space. The basic ideas and theorems of 
newtonian mechanics (even when formulated in terms of three-dimensional 
cartesian coordinates) are invariant with respect to the six-dimensional¹
group of euclidean motions of this space. 

A newtonian potential mechanical system is specified by the masses 
of the points and by the potential energy. The motions of space which leave 
the potential energy invariant correspond to laws of conservation. 

Newton’s equations allow one to solve completely a series of important
problems in mechanics, including the problem of motion in a central force
field.


¹ And also with respect to the larger group of galilean transformations of space-time.


Experimental facts 


In this chapter we write down the basic experimental facts which lie at the 
foundation of mechanics: Galileo’s principle of relativity and Newton’s 
differential equation. We examine constraints on the equation of motion 
imposed by the relativity principle, and we mention some simple examples. 


1 The principles of relativity and determinacy 


In this paragraph we introduce and discuss the notion of an inertial coordinate system. The 
mathematical statements of this paragraph are formulated exactly in the next paragraph. 


A series of experimental facts is at the basis of classical mechanics.² We
list some of them. 


A Space and time 


Our space is three-dimensional and euclidean, and time is one-dimensional. 


B Galileo’s principle of relativity 


There exist coordinate systems (called inertial) possessing the following 
two properties: 


1. All the laws of nature at all moments of time are the same in all inertial 
coordinate systems. 

2. All coordinate systems in uniform rectilinear motion with respect to an 
inertial one are themselves inertial. 


² All these “experimental facts” are only approximately true and can be refuted by more exact
experiments. In order to avoid cumbersome expressions, we will not specify this from now on 
and we will speak of our mathematical models as if they exactly described physical phenomena. 


In other words, if a coordinate system attached to the earth is inertial, 
then an experimenter on a train which is moving uniformly in a straight line 
with respect to the earth cannot detect the motion of the train by experiments 
conducted entirely inside his car. 

In reality, the coordinate system associated with the earth is only approxi- 
mately inertial. Coordinate systems associated with the sun, the stars, etc. 
are more nearly inertial. 


C Newton's principle of determinacy 


The initial state of a mechanical system (the totality of positions and 
velocities of its points at some moment of time) uniquely determines all of 
its motion. 

It is hard to doubt this fact, since we learn it very early. One can imagine 
a world in which to determine the future of a system one must also know the 
acceleration at the initial moment, but experience shows us that our world 
is not like this. 


2 The galilean group and Newton’s equations 


In this paragraph we define and investigate the galilean group of space-time transformations. 
Then we consider Newton’s equation and the simplest constraints imposed on its right-hand side 
by the property of invariance with respect to galilean transformations.³


A Notation 


We denote the set of all real numbers by $\mathbb{R}$. We denote by $\mathbb{R}^n$ an
n-dimensional real vector space.


Figure 1 Parallel displacement


Affine n-dimensional space $A^n$ is distinguished from $\mathbb{R}^n$ in that there is
“no fixed origin.” The group $\mathbb{R}^n$ acts on $A^n$ as the group of parallel
displacements (Figure 1):

$$a \to a + \mathbf{b}, \quad a \in A^n, \ \mathbf{b} \in \mathbb{R}^n, \ a + \mathbf{b} \in A^n.$$

[Thus the sum of two points of $A^n$ is not defined, but their difference is defined
and is a vector in $\mathbb{R}^n$.]


³ The reader who has no need for the mathematical formulation of the assertions of Section 1
can omit this section. 


A euclidean structure on the vector space $\mathbb{R}^n$ is a positive definite symmetric
bilinear form called a scalar product. The scalar product enables one to
define the distance

$$\rho(x, y) = \|x - y\| = \sqrt{(x - y, x - y)}$$

between points of the corresponding affine space $A^n$. An affine space with this
distance function is called a euclidean space and is denoted by $E^n$.


B Galilean structure 
The galilean space-time structure consists of the following three elements: 


1. The universe: a four-dimensional affine⁴ space $A^4$. The points of $A^4$
are called world points or events. The parallel displacements of the universe
$A^4$ constitute a vector space $\mathbb{R}^4$.

2. Time: a linear mapping $t: \mathbb{R}^4 \to \mathbb{R}$ from the vector space of parallel
displacements of the universe to the real “time axis.” The time interval
from event $a \in A^4$ to event $b \in A^4$ is the number $t(b - a)$ (Figure 2). If
$t(b - a) = 0$, then the events $a$ and $b$ are called simultaneous.


Figure 2 Interval of time t 


The set of events simultaneous with a given event forms a three-dimensional
affine subspace in $A^4$. It is called a space of simultaneous
events $A^3$.

The kernel of the mapping $t$ consists of those parallel displacements of
$A^4$ which take some (and therefore every) event into an event simultaneous
with it. This kernel is a three-dimensional linear subspace $\mathbb{R}^3$ of the vector
space $\mathbb{R}^4$.


The galilean structure includes one further element.

3. The distance between simultaneous events

$$\rho(a, b) = \|a - b\| = \sqrt{(a - b, a - b)}, \quad a, b \in A^3$$

is given by a scalar product on the space $\mathbb{R}^3$. This distance makes every
space of simultaneous events into a three-dimensional euclidean space $E^3$.


⁴ Formerly, the universe was provided not with an affine, but with a linear structure (the
geocentric system of the universe).




A space $A^4$, equipped with a galilean space-time structure, is called a
galilean space.

One can speak of two events occurring simultaneously in different places,
but the expression “two non-simultaneous events $a, b \in A^4$ occurring at
one and the same place in three-dimensional space” has no meaning as long
as we have not chosen a coordinate system.

The galilean group is the group of all transformations of a galilean space 
which preserve its structure. The elements of this group are called galilean 
transformations. Thus, galilean transformations are affine transformations 
of $A^4$ which preserve intervals of time and the distance between simultaneous
events. 


EXAMPLE. Consider the direct product⁵ $\mathbb{R} \times \mathbb{R}^3$ of the $t$ axis with a
three-dimensional vector space $\mathbb{R}^3$; suppose $\mathbb{R}^3$ has a fixed euclidean structure.
Such a space has a natural galilean structure. We will call this space galilean
coordinate space.

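We mention three examples of galilean transformations of this space.
First, uniform motion with velocity $\mathbf{v}$:

$$g_1(t, \mathbf{x}) = (t, \mathbf{x} + \mathbf{v}t), \quad \forall t \in \mathbb{R}, \ \mathbf{x} \in \mathbb{R}^3.$$

Next, translation of the origin (the number $s$ shifts time, the vector $\mathbf{s}$ shifts space):

$$g_2(t, \mathbf{x}) = (t + s, \mathbf{x} + \mathbf{s}), \quad \forall t \in \mathbb{R}, \ \mathbf{x} \in \mathbb{R}^3.$$

Finally, rotation of the coordinate axes:

$$g_3(t, \mathbf{x}) = (t, G\mathbf{x}), \quad \forall t \in \mathbb{R}, \ \mathbf{x} \in \mathbb{R}^3,$$

where $G: \mathbb{R}^3 \to \mathbb{R}^3$ is an orthogonal transformation.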
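
The three transformations can be tried out numerically. The following Python sketch is only an illustration and is not part of the original text: the names `uniform_motion`, `translation`, `rotation_z`, and `compose` are hypothetical, events are represented as pairs (t, x), and the rotation is taken about the third axis purely for simplicity. For one sample composition it checks that time intervals and distances between simultaneous events are preserved.

```python
import math

def uniform_motion(v):
    """g1: (t, x) -> (t, x + v t)."""
    return lambda t, x: (t, tuple(xi + vi * t for xi, vi in zip(x, v)))

def translation(s, r):
    """g2: (t, x) -> (t + s, x + r); s shifts time, r shifts space."""
    return lambda t, x: (t + s, tuple(xi + ri for xi, ri in zip(x, r)))

def rotation_z(angle):
    """g3: (t, x) -> (t, G x), with G a rotation about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    return lambda t, x: (t, (c * x[0] - s * x[1], s * x[0] + c * x[1], x[2]))

def compose(*gs):
    """compose(g1, g2, g3) is g1 o g2 o g3: the rightmost map acts first."""
    def g(t, x):
        for f in reversed(gs):
            t, x = f(t, x)
        return t, x
    return g

def time_interval(a, b):
    """t(b - a) for events a = (t_a, x_a) and b = (t_b, x_b)."""
    return b[0] - a[0]

def distance(a, b):
    """Euclidean distance between the spatial parts of two events."""
    return math.dist(a[1], b[1])

# Sample values chosen arbitrarily for the illustration.
g = compose(uniform_motion((1.0, -2.0, 0.5)),
            translation(3.0, (1.0, 1.0, 1.0)),
            rotation_z(0.7))

a = (2.0, (1.0, 0.0, 0.0))
b = (2.0, (0.0, 4.0, 3.0))   # simultaneous with a
c = (5.0, (0.0, 0.0, 0.0))   # three time units after a

ga, gb, gc = g(*a), g(*b), g(*c)
print(time_interval(a, c), time_interval(ga, gc))
print(distance(a, b), distance(ga, gb))
```

Note that the distance is compared only for the simultaneous events a and b; for non-simultaneous events the uniform motion changes the spatial separation, in agreement with the remark that "the same place" has no meaning without a coordinate system.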


PROBLEM. Show that every galilean transformation of the space $\mathbb{R} \times \mathbb{R}^3$
can be written in a unique way as the composition of a rotation, a translation,
and a uniform motion ($g = g_1 \circ g_2 \circ g_3$) (thus the dimension of the galilean
group is equal to 3 + 4 + 3 = 10).


PROBLEM. Show that all galilean spaces are isomorphic to each other⁶
and, in particular, isomorphic to the coordinate space $\mathbb{R} \times \mathbb{R}^3$.


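Let $M$ be a set. A one-to-one correspondence $\varphi_1: M \to \mathbb{R} \times \mathbb{R}^3$ is called
a galilean coordinate system on the set $M$. A coordinate system $\varphi_2$ moves
uniformly with respect to $\varphi_1$ if $\varphi_1 \circ \varphi_2^{-1}: \mathbb{R} \times \mathbb{R}^3 \to \mathbb{R} \times \mathbb{R}^3$ is a galilean
transformation. The galilean coordinate systems $\varphi_1$ and $\varphi_2$ give $M$ the same
galilean structure.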


⁵ Recall that the direct product of two sets $A$ and $B$ is the set of ordered pairs $(a, b)$, where
$a \in A$ and $b \in B$. The direct product of two spaces (vector, affine, euclidean) has the structure of a
space of the same type.


⁶ That is, there is a one-to-one mapping of one to the other preserving the galilean structure.


C Motion, velocity, acceleration 
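A motion in $\mathbb{R}^N$ is a differentiable mapping $\mathbf{x}: I \to \mathbb{R}^N$, where $I$ is an interval
on the real axis.

The derivative

$$\dot{\mathbf{x}}(t_0) = \left.\frac{d\mathbf{x}}{dt}\right|_{t=t_0} = \lim_{h \to 0} \frac{\mathbf{x}(t_0 + h) - \mathbf{x}(t_0)}{h} \in \mathbb{R}^N$$

is called the velocity vector at the point $t_0 \in I$.

The second derivative

$$\ddot{\mathbf{x}}(t_0) = \left.\frac{d^2\mathbf{x}}{dt^2}\right|_{t=t_0}$$

is called the acceleration vector at the point $t_0$.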

We will assume that the functions we encounter are continuously differentiable
as many times as necessary. In the future, unless otherwise stated,
mappings, functions, etc. are understood to be differentiable mappings,
functions, etc. The image of a mapping $\mathbf{x}: I \to \mathbb{R}^N$ is called a trajectory or
curve in $\mathbb{R}^N$.


PROBLEM. Is it possible for the trajectory of a differentiable motion on the
plane to have the shape drawn in Figure 3? Is it possible for the acceleration 
vector to have the value shown? 


ANSWER. Yes. No. 


Figure 3. Trajectory of motion of a point 


We now define a mechanical system of n points moving in three-dimensional 
euclidean space. 

Let $\mathbf{x}: \mathbb{R} \to \mathbb{R}^3$ be a motion in $\mathbb{R}^3$. The graph⁷ of this mapping is a curve
in $\mathbb{R} \times \mathbb{R}^3$.

A curve in galilean space which appears in some (and therefore every) 
galilean coordinate system as the graph of a motion, is called a world line 
(Figure 4). 


⁷ The graph of a mapping $f: A \to B$ is the subset of the direct product $A \times B$ consisting of all
pairs $(a, f(a))$ with $a \in A$.


Figure 4 World lines


A motion of a system of $n$ points gives, in galilean space, $n$ world lines.
In a galilean coordinate system they are described by $n$ mappings $\mathbf{x}_i: \mathbb{R} \to \mathbb{R}^3$,
$i = 1, \dots, n$.

The direct product of $n$ copies of $\mathbb{R}^3$ is called the configuration space
of the system of $n$ points. Our $n$ mappings $\mathbf{x}_i: \mathbb{R} \to \mathbb{R}^3$ define one mapping

$$\mathbf{x}: \mathbb{R} \to \mathbb{R}^N, \quad N = 3n$$

of the time axis into the configuration space. Such a mapping is also called
a motion of a system of $n$ points in the galilean coordinate system on $\mathbb{R} \times \mathbb{R}^3$.


D Newton’s equations 


According to Newton’s principle of determinacy (Section 1C) all motions
of a system are uniquely determined by their initial positions ($\mathbf{x}(t_0) \in \mathbb{R}^N$)
and initial velocities ($\dot{\mathbf{x}}(t_0) \in \mathbb{R}^N$).

In particular, the initial positions and velocities determine the acceleration.
In other words, there is a function $\mathbf{F}: \mathbb{R}^N \times \mathbb{R}^N \times \mathbb{R} \to \mathbb{R}^N$ such that

(1)    $\ddot{\mathbf{x}} = \mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}, t).$


Newton used Equation (1) as the basis of mechanics. It is called Newton’s 
equation. 

By the theorem of existence and uniqueness of solutions to ordinary
differential equations, the function $\mathbf{F}$ and the initial conditions $\mathbf{x}(t_0)$ and
$\dot{\mathbf{x}}(t_0)$ uniquely determine a motion.⁸

For each specific mechanical system the form of the function F is deter- 
mined experimentally. From the mathematical point of view the form of F 
for each system constitutes the definition of that system. 
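
The statement that F together with the initial position and velocity determines the motion can be illustrated numerically. The Python sketch below is an illustration under stated assumptions, not a method taken from the text: it treats a single scalar coordinate, rewrites Newton's equation as a first-order system for (x, v), and advances it with the classical fourth-order Runge-Kutta scheme. The function name `integrate` and the sample force F(x, v, t) = -x are hypothetical choices.

```python
import math

def integrate(F, x0, v0, t0, t1, steps=1000):
    """Integrate x'' = F(x, x', t) from t0 to t1 with classical RK4.

    The second-order equation is treated as the first-order system
    x' = v, v' = F(x, v, t). Returns the final position and velocity.
    """
    h = (t1 - t0) / steps
    x, v, t = x0, v0, t0
    for _ in range(steps):
        k1x, k1v = v, F(x, v, t)
        k2x = v + 0.5 * h * k1v
        k2v = F(x + 0.5 * h * k1x, v + 0.5 * h * k1v, t + 0.5 * h)
        k3x = v + 0.5 * h * k2v
        k3v = F(x + 0.5 * h * k2x, v + 0.5 * h * k2v, t + 0.5 * h)
        k4x = v + h * k3v
        k4v = F(x + h * k3x, v + h * k3v, t + h)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return x, v

# Sample force (an assumption for the demo): F = -x, whose exact solution
# with x(0) = 1, x'(0) = 0 is x(t) = cos t.
x1, v1 = integrate(lambda x, v, t: -x, x0=1.0, v0=0.0, t0=0.0, t1=1.0)
print(x1, math.cos(1.0))
print(v1, -math.sin(1.0))
```

Changing either the initial position or the initial velocity changes the computed motion, while the acceleration at the initial moment never has to be supplied separately: it is computed from F.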


E Constraints imposed by the principle of relativity 


Galileo’s principle of relativity states that in physical space-time there is a 
selected galilean structure (“the class of inertial coordinate systems”) 
having the following property. 


⁸ Under certain smoothness conditions, which we assume to be fulfilled. In general, a motion
is determined by Equation (1) only on some interval of the time axis. For simplicity we will 
assume that this interval is the whole time axis, as is the case in most problems in mechanics. 
Figure 5 Galileo’s principle of relativity


If we subject the world lines of all the points of any mechanical system⁹
to one and the same galilean transformation, we obtain world lines of the 
same system (with new initial conditions) (Figure 5). 

This imposes a series of conditions on the form of the right-hand side of 
Newton’s equation written in an inertial coordinate system: Equation (1) 
must be invariant with respect to the group of galilean transformations. 


EXAMPLE 1. Among the galilean transformations are the time translations. 
Invariance with respect to time translations means that “the laws of nature 
remain constant,” i.e., if $\mathbf{x} = \boldsymbol{\varphi}(t)$ is a solution to Equation (1), then for any
$s \in \mathbb{R}$, $\mathbf{x} = \boldsymbol{\varphi}(t + s)$ is also a solution.

From this it follows that the right-hand side of Equation (1) in an inertial 
coordinate system does not depend on the time: 


$$\ddot{\mathbf{x}} = \boldsymbol{\Phi}(\mathbf{x}, \dot{\mathbf{x}}).$$


Remark. Differential equations in which the right-hand side does depend 
on time arise in the following situation. 

Suppose that we are studying part I of the mechanical system I + II. 
Then the influence of part II on part I can sometimes be replaced by a time 
variation of parameters in the system of equations describing the motion of 
part I. For example, the influence of the moon on the earth can be ignored in 
investigating the majority of phenomena on the earth. However, in the study of 
the tides this influence must be taken into account; one can achieve this by 
introducing, instead of the attraction of the moon, periodic changes in the 
strength of gravity on earth. 


⁹ In formulating the principle of relativity we must keep in mind that it is relevant only to
closed physical (in particular, mechanical) systems, i.e., that we must include in the system all 
bodies whose interactions play a role in the study of the given phenomena. Strictly speaking, we 
should include in the system all bodies in the universe. But we know from experience that one 
can disregard the effect of many of them: for example, in studying the motion of planets around 
the sun we can disregard the attractions among the stars, etc. 

On the other hand, in the study of a body in the vicinity of earth, the system is not closed 
if the earth is not included; in the study of the motion of an airplane the system is not closed if 
it does not include the air surrounding the airplane, etc. In the future, the term “mechanical 
system” will mean a closed system in most cases, and when there is a non-closed system in 
question this will be explicitly stated (cf., for example, Section 3). 




Equations with variable coefficients can appear also as the result of formal 
operations in the solution of problems. 


EXAMPLE 2. Translations in three-dimensional space are galilean trans- 
formations. Invariance with respect to such translations means that space 
is homogeneous, or “has the same properties at all of its points.” That is, 
if $\mathbf{x}_i = \boldsymbol{\varphi}_i(t)$ $(i = 1, \dots, n)$ is a motion of a system of $n$ points satisfying (1),
then for any $\mathbf{r} \in \mathbb{R}^3$ the motion $\boldsymbol{\varphi}_i(t) + \mathbf{r}$ $(i = 1, \dots, n)$ also satisfies Equation
(1).

From this it follows that the right-hand side of Equation (1) in the inertial
coordinate system can depend only on the “relative coordinates” $\mathbf{x}_j - \mathbf{x}_k$.

From invariance under passage to a uniformly moving coordinate system
(which does not change $\ddot{\mathbf{x}}_i$ or $\mathbf{x}_j - \mathbf{x}_k$, but adds to each $\dot{\mathbf{x}}_j$ a fixed vector $\mathbf{v}$) it
follows that the right-hand side of Equation (1) in an inertial system of
coordinates can depend only on the relative velocities

$$\ddot{\mathbf{x}}_i = \mathbf{f}_i(\{\mathbf{x}_j - \mathbf{x}_k, \dot{\mathbf{x}}_j - \dot{\mathbf{x}}_k\}), \quad i, j, k = 1, \dots, n.$$
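
A small numerical check may make this concrete. The Python sketch below uses a made-up force law (an assumption for illustration only, not a law from the text) in which the acceleration of each point is built from differences of positions and differences of velocities. The names `pair_accelerations` and `shifted` and the constants `k` and `c` are hypothetical. The accelerations come out the same after a spatial translation by r and after passing to a frame moving with velocity u.

```python
def pair_accelerations(xs, vs, k=1.0, c=0.1):
    """Accelerations depending only on relative coordinates and velocities.

    Hypothetical law: each pair (i, j) contributes
    -k (x_i - x_j) - c (v_i - v_j) to the acceleration of point i.
    """
    acc = []
    for i in range(len(xs)):
        a = [0.0, 0.0, 0.0]
        for j in range(len(xs)):
            if j != i:
                for d in range(3):
                    a[d] += -k * (xs[i][d] - xs[j][d]) - c * (vs[i][d] - vs[j][d])
        acc.append(tuple(a))
    return acc

def shifted(points, r):
    """Add the same vector r to every point (or to every velocity)."""
    return [tuple(p[d] + r[d] for d in range(3)) for p in points]

# Arbitrary sample state of three points.
xs = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (-1.0, 0.5, 3.0)]
vs = [(0.1, 0.0, 0.0), (0.0, -0.2, 0.3), (0.5, 0.5, 0.0)]
r = (10.0, -4.0, 2.5)   # spatial translation
u = (3.0, 1.0, -2.0)    # velocity of the uniformly moving frame

before = pair_accelerations(xs, vs)
after = pair_accelerations(shifted(xs, r), shifted(vs, u))
max_diff = max(abs(p - q) for a, b in zip(before, after) for p, q in zip(a, b))
print(max_diff)  # zero up to rounding error
```

A law that used the absolute positions or absolute velocities would fail this check, which is the content of the constraint.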


EXAMPLE 3. Among the galilean transformations are the rotations in three- 
dimensional space. Invariance with respect to these rotations means that 
space is isotropic; there are no preferred directions. 

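Thus, if $\boldsymbol{\varphi}_i: \mathbb{R} \to \mathbb{R}^3$ $(i = 1, \dots, n)$ is a motion of a system of points satisfying
(1), and $G: \mathbb{R}^3 \to \mathbb{R}^3$ is an orthogonal transformation, then the motion
$G\boldsymbol{\varphi}_i: \mathbb{R} \to \mathbb{R}^3$ $(i = 1, \dots, n)$ also satisfies (1). In other words,

$$\mathbf{F}(G\mathbf{x}, G\dot{\mathbf{x}}) = G\mathbf{F}(\mathbf{x}, \dot{\mathbf{x}}),$$

where $G\mathbf{x}$ denotes $(G\mathbf{x}_1, \dots, G\mathbf{x}_n)$, $\mathbf{x}_i \in \mathbb{R}^3$.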
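
The identity can be checked numerically for a particular force. The Python sketch below is an illustration only: it assumes an inverse-square attraction between points of unit mass (so this F does not depend on the velocities), and the helper names `rot_z`, `rot_x`, `apply`, and `gravity_accelerations` are hypothetical. It compares F(Gx) with GF(x) for a rotation G composed of two elementary rotations.

```python
import math

def rot_z(angle):
    """Matrix of a rotation about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def rot_x(angle):
    """Matrix of a rotation about the first axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))

def apply(G, x):
    """Matrix-vector product G x in R^3."""
    return tuple(sum(G[r][k] * x[k] for k in range(3)) for r in range(3))

def gravity_accelerations(xs):
    """Assumed force law: unit masses attracting by the inverse-square law."""
    acc = []
    for i, xi in enumerate(xs):
        a = [0.0, 0.0, 0.0]
        for j, xj in enumerate(xs):
            if i != j:
                r3 = math.dist(xi, xj) ** 3
                for k in range(3):
                    a[k] -= (xi[k] - xj[k]) / r3
        acc.append(tuple(a))
    return acc

def G(p):
    """An orthogonal transformation: rotate about z, then about x."""
    return apply(rot_x(1.1), apply(rot_z(0.3), p))

xs = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
lhs = gravity_accelerations([G(p) for p in xs])   # F(Gx)
rhs = [G(a) for a in gravity_accelerations(xs)]   # G F(x)
max_diff = max(abs(p - q) for a, b in zip(lhs, rhs) for p, q in zip(a, b))
print(max_diff)  # zero up to rounding error
```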


PROBLEM. Show that if a mechanical system consists of only one point, then
its acceleration in an inertial coordinate system is equal to zero (“Newton’s
first law”).

Hint. By Examples 1 and 2 the acceleration vector does not depend on
$\mathbf{x}$, $\dot{\mathbf{x}}$, or $t$, and by Example 3 the vector $\mathbf{F}$ is invariant with respect to rotation.


PROBLEM. A mechanical system consists of two points. At the initial moment 
their velocities (in some inertial coordinate system) are equal to zero. Show 
that the points will stay on the line which connected them at the initial 
moment. 


PROBLEM. A mechanical system consists of three points. At the initial moment 
their velocities (in some inertial coordinate system) are equal to zero. 
Show that the points always remain in the plane which contained them at the 
initial moment. 


PROBLEM. A mechanical system consists of two points. Show that for any 
initial conditions there exists an inertial coordinate system in which the 
two points remain in a fixed plane. 


PROBLEM. Show that mechanics “through the looking glass” is identical 
to ours. 

Hint. In the galilean group there is a reflection transformation, changing 
the orientation of $\mathbb{R}^3$.


PROBLEM. Is the class of inertial systems unique? 


ANSWER. No. Other classes can be obtained if one changes the units of length
and time or the direction of time. 


3 Examples of mechanical systems 


We have already remarked that the form of the function F in Newton’s equation (1) is determined 
experimentally for each mechanical system. Here are several examples. 

In examining concrete systems it is reasonable not to include all the objects of the universe 
in a system. For example, in studying the majority of phenomena taking place on the earth we 
can ignore the influence of the moon. Furthermore, it is usually possible to disregard the effect 
of the processes we are studying on the motion of the earth itself; we may even consider a coordi- 
nate system attached to the earth as “fixed.” It is clear that the principle of relativity no longer 
imposes the constraints found in Section 2 for equations of motion written in such a coordinate 
system. For example, near the earth there is a distinguished direction, the vertical. 


A Example 1: A stone falling to the earth 


Experiments show that

(2)    $\ddot{x} = -g$, where $g \approx 9.8\ \mathrm{m/s^2}$ (Galileo)*

where $x$ is the height of a stone above the surface of the earth.
If we introduce the “potential energy” $U = gx$, then Equation (2) can
be written in the form

$$\ddot{x} = -\frac{dU}{dx}.$$

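If $U: E^N \to \mathbb{R}$ is a differentiable function on euclidean space, then we will
denote by $\partial U/\partial \mathbf{x}$ the gradient of the function $U$. If $E^N = E^{n_1} \times \cdots \times E^{n_k}$
is a direct product of euclidean spaces, then we will denote a point $\mathbf{x} \in E^N$
by $(\mathbf{x}_1, \dots, \mathbf{x}_k)$, and the vector $\partial U/\partial \mathbf{x}$ by $(\partial U/\partial \mathbf{x}_1, \dots, \partial U/\partial \mathbf{x}_k)$. In particular,
if $x_1, \dots, x_N$ are cartesian coordinates in $E^N$, then the components of the
vector $\partial U/\partial \mathbf{x}$ are the partial derivatives $\partial U/\partial x_1, \dots, \partial U/\partial x_N$.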

Experiments show that the radius vector of the stone with respect to
some point 0 on the earth satisfies the equation

(3)    $\ddot{\mathbf{x}} = -\frac{\partial U}{\partial \mathbf{x}}$, where $U = -(\mathbf{g}, \mathbf{x})$.


* In this and other sections, the mass of a particle is taken to be 1. 


The vector in the right-hand side is directed towards the earth. It is called 
the gravitational acceleration vector g. (Figure 6.) 


Figure 6 A stone falling to the earth 


B Example 2: Falling from great height 


Like all experimental facts, the law of motion (2) has a restricted domain of
application. According to a more precise law of falling bodies, discovered
by Newton, acceleration is inversely proportional to the square of the distance
from the center of the earth:

$$\ddot{x} = -g \frac{r_0^2}{r^2},$$

where $r = r_0 + x$ (Figure 7).


Figure 7 The earth’s gravitational field 


This equation can also be written in the form (3), if we introduce the
potential energy

$$U = -\frac{k}{r}, \quad k = g r_0^2,$$

inversely proportional to the distance to the center of the earth.


PROBLEM. Determine with what velocity a stone must be thrown in order that 
it fly infinitely far from the surface of the earth.¹⁰


ANSWER. > 11.2 km/sec. 
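
The stated value can be reproduced with a few lines of arithmetic. The Python sketch below is an illustration, not the book's own derivation: it assumes conservation of the energy v²/2 + U with U = -k/r and k = g r₀², so that the threshold velocity at the surface satisfies v²/2 = k/r₀. The radius r₀ ≈ 6.37 × 10⁶ m is an assumed value that the text does not give.

```python
import math

g = 9.8        # m/s^2, the value quoted with Equation (2)
r0 = 6.37e6    # m, assumed radius of the earth (not stated in the text)

k = g * r0 ** 2                       # constant in U = -k / r
v_escape = math.sqrt(2 * k / r0)      # from v^2 / 2 - k / r0 = 0

print(round(v_escape / 1000, 1))      # velocity in km/sec
```

With these inputs the result rounds to 11.2 km/sec, in agreement with the answer above; the exact digits depend on the assumed radius.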


¹⁰ This is the so-called second cosmic velocity $v_2$. Our equation does not take into account the
attraction of the sun. The attraction of the sun will not let the stone escape from the solar system 
if the velocity of the stone with respect to the earth is less than 16.6 km/sec. 


C Example 3: Motion of a weight along a line 
under the action of a spring 


Experiments show that under small extensions of the spring the equation
of motion of the weight will be (Figure 8)

$$\ddot{x} = -\alpha^2 x.$$

Figure 8 Weight on a spring

This equation can also be written in the form (3) if we introduce the
potential energy

$$U = \frac{\alpha^2 x^2}{2}.$$

If we replace our one weight by two weights, then it turns out that, under 
the same extension of the spring, the acceleration is half as large. 

It is experimentally established that for any two bodies the ratio of the
accelerations $\ddot{x}_1/\ddot{x}_2$ under the same extension of a spring is fixed (does not
depend on the extent of extension of the spring or on its characteristics, but
only on the bodies themselves). The value inverse to this ratio is by definition
the ratio of masses:

$$\frac{m_1}{m_2} = \frac{\ddot{x}_2}{\ddot{x}_1}.$$


For a unit of mass we take the mass of some fixed body, e.g., one liter of 
water. We know by experience that the masses of all bodies are positive. The 
product of mass times acceleration $m\ddot{x}$ does not depend on the body, and
is a characteristic of the extension of the spring. This value is called the 
force of the spring acting on the body. 

As a unit of force, we take the “newton.” If one liter of water is suspended 
on a spring at the surface of the earth, the spring acts with a force of 9.8 
newtons (=1 kg). 


D Example 4: Conservative systems 


Let $E^{3n} = E^3 \times \cdots \times E^3$ be the configuration space of a system of n points
in the euclidean space $E^3$. Let $U: E^{3n} \to \mathbb{R}$ be a differentiable function and
let $m_1, \ldots, m_n$ be positive numbers.


Definition. The motion of n points, of masses $m_1, \ldots, m_n$, in the potential
field with potential energy U is given by the system of differential equations

$$m_i \ddot{\mathbf{x}}_i = -\frac{\partial U}{\partial \mathbf{x}_i}, \qquad i = 1, \ldots, n. \tag{4}$$


The equations of motion in Examples 1 to 3 have this form. The equations
of motion of many other mechanical systems can be written in the same form.
For example, the three-body problem of celestial mechanics is problem (4)
in which

$$U = -\frac{m_1 m_2}{|\mathbf{x}_1 - \mathbf{x}_2|} - \frac{m_2 m_3}{|\mathbf{x}_2 - \mathbf{x}_3|} - \frac{m_3 m_1}{|\mathbf{x}_3 - \mathbf{x}_1|}.$$

Many different equations of entirely different origin can be reduced to
form (4), for example the equations of electrical oscillations. In the following
chapter we will study mainly systems of differential equations in the form (4).


Investigation of the equations of motion


In most cases (for example, in the three-body problem) we can neither solve 
the system of differential equations nor completely describe the behavior 
of the solutions. In this chapter we consider a few simple but important 
problems for which Newton’s equations can be solved. 


4 Systems with one degree of freedom 


In this paragraph we study the phase flow of the differential equation (1). A look at the graph of 
the potential energy is enough for a qualitative analysis of such an equation. In addition, Equation 
(1) is integrated by quadratures. 


A. Definitions 
A system with one degree of freedom is a system described by one differential 
equation 


$$\ddot{x} = f(x), \qquad x \in \mathbb{R}. \tag{1}$$

The kinetic energy is the quadratic form*

$$T = \tfrac{1}{2}\dot{x}^2.$$

The potential energy is the function

$$U(x) = -\int_{x_0}^{x} f(\xi)\,d\xi.$$


The sign in this formula is taken so that the potential energy of a stone is 
larger if the stone is higher off the ground. 

Notice that the potential energy determines f. Therefore, to specify a 
system of the form (1) it is enough to give the potential energy. Adding a 
constant to the potential energy does not change the equation of motion (1). 


* see footnote on p. 11. 




The total energy is the sum

$$E = T + U.$$

In general, the total energy is a function, $E(x, \dot{x})$, of $x$ and $\dot{x}$.

Theorem (The law of conservation of energy). The total energy of points
moving according to the equation (1) is conserved: $E(x(t), \dot{x}(t))$ is independent
of t.

PROOF.

$$\frac{d}{dt}(T + U) = \dot{x}\ddot{x} + \frac{dU}{dx}\dot{x} = \dot{x}\,(\ddot{x} - f(x)) = 0. \qquad \square$$

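The conservation law can also be observed numerically. The sketch below integrates $\ddot{x} = f(x)$ as the first-order system $\dot{x} = y$, $\dot{y} = f(x)$ with a classical Runge-Kutta step and records how far $E = y^2/2 + U(x)$ drifts. The choice $f(x) = -\sin x$, $U(x) = -\cos x$, the step size, and all function names are illustrative assumptions; the drift is small because the integrator is accurate, not because it conserves energy exactly.

```python
import math

def rk4_step(f, x, y, dt):
    """One classical Runge-Kutta step for the system x' = y, y' = f(x)."""
    k1x, k1y = y, f(x)
    k2x, k2y = y + 0.5 * dt * k1y, f(x + 0.5 * dt * k1x)
    k3x, k3y = y + 0.5 * dt * k2y, f(x + 0.5 * dt * k2x)
    k4x, k4y = y + dt * k3y, f(x + dt * k3x)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x_new, y_new

def force(x):
    # Assumed example: f(x) = -sin x
    return -math.sin(x)

def energy(x, y):
    # E = T + U with T = y^2/2 and U(x) = -cos x, so that f = -dU/dx
    return 0.5 * y * y - math.cos(x)

x, y = 1.0, 0.0
e0 = energy(x, y)
max_drift = 0.0
for _ in range(10000):          # integrate up to t = 10
    x, y = rk4_step(force, x, y, 0.001)
    max_drift = max(max_drift, abs(energy(x, y) - e0))

print("largest deviation of E from its initial value:", max_drift)
```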
B Phase flow 
Equation (1) is equivalent to the system of two equations: 
$$\dot{x} = y, \qquad \dot{y} = f(x). \tag{2}$$


We consider the plane with coordinates x and y, which we call the phase plane 
of Equation (1). The points of the phase plane are called phase points. The 
right-hand side of (2) determines a vector field on the phase plane, called the 
phase velocity vector field. 

A solution of (2) is a motion $\varphi: \mathbb{R} \to \mathbb{R}^2$ of a phase point in the phase
plane, such that the velocity of the moving point at each moment of time is
equal to the phase velocity vector at the location of the phase point at that
moment.¹¹

The image of $\varphi$ is called the phase curve. Thus the phase curve is given by
the parametric equations

$$x = \varphi(t), \qquad y = \dot{\varphi}(t).$$

PROBLEM. Show that through every phase point there is one and only one
phase curve.

Hint. Refer to a textbook on ordinary differential equations.


We notice that a phase curve could consist of only one point. Such a 
point is called an equilibrium position. The vector of phase velocity at an 
equilibrium position is zero. 

The law of conservation of energy allows one to find the phase curves 
easily. On each phase curve the value of the total energy is constant. Therefore, 
each phase curve lies entirely in one energy level set E(x, y) = h. 

C Examples 


EXAMPLE 1. The basic equation of the theory of oscillations is

$$\ddot{x} = -x.$$


¹¹ Here we assume for simplicity that the solution $\varphi$ is defined on the whole time axis $\mathbb{R}$.

Figure 9 Phase plane of the equation $\ddot{x} = -x$

In this case (Figure 9) we have:

$$T = \frac{\dot{x}^2}{2}, \qquad U = \frac{x^2}{2}, \qquad E = \frac{\dot{x}^2}{2} + \frac{x^2}{2}.$$


The energy level sets are the concentric circles and the origin. The phase
velocity vector at the phase point $(x, y)$ has components $(y, -x)$. It is
perpendicular to the radius vector and equal to it in magnitude. Therefore,
the motion of the phase point in the phase plane is a uniform motion around
0: $x = r_0 \cos(\varphi_0 - t)$, $y = r_0 \sin(\varphi_0 - t)$. Each energy level set is a phase
curve.


EXAMPLE 2. Suppose that a potential energy is given by the graph in Figure
10. We will draw the energy level sets $\tfrac{1}{2}y^2 + U(x) = E$. For this, the following
facts are helpful.


1. Any equilibrium position of (2) must lie on the x axis of the phase plane.
The point $x = \xi$, $y = 0$ is an equilibrium position if $\xi$ is a critical point
of the potential energy, i.e., if $(\partial U/\partial x)|_{x=\xi} = 0$.

2. Each level set is a smooth curve in a neighborhood of each of its points
which is not an equilibrium position (this follows from the implicit
function theorem). In particular, if the number E is not a critical value of
the potential energy (i.e., is not the value of the potential energy at one of
its critical points), then the level set on which the energy is equal to E
is a smooth curve.


It follows that in order to study the energy level curve, we should turn 
our attention to the critical and near-critical values of E. It is convenient 
here to imagine a little ball rolling in the potential well U. 

For example, consider the following argument: “Kinetic energy is 
nonnegative. This means that potential energy is less than or equal to the 
total energy. The smaller the potential energy, the greater the velocity.” 
This translates to: “The ball cannot jump out of the potential well, rising
higher than the level determined by its initial energy. As it falls into the well,
the ball gains velocity.” We also notice that the local maximum points of the
potential energy are unstable, but the minimum points are stable equilibrium
positions.

Figure 10 Potential energy and phase curves


PROBLEM. Prove this. 


PROBLEM. How many phase curves make up the separatrix (figure eight)
curve, corresponding to the level $E_2$?

ANSWER. Three.

PROBLEM. Determine the duration of motion along the separatrix.

ANSWER. It follows from the uniqueness theorem that the time is infinite.

PROBLEM. Show that the time it takes to go from $x_1$ to $x_2$ (in one direction)
is equal to

$$t_2 - t_1 = \int_{x_1}^{x_2} \frac{dx}{\sqrt{2(E - U(x))}}.$$
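As an illustration of this quadrature, the sketch below evaluates the integral with a midpoint rule for the assumed example $U = x^2/2$, $E = 1/2$, for which the motion is $x = \sin t$ and the time to go from 0 to 1/2 should be $\pi/6$. The function names are hypothetical, and the simple rule assumes $E > U$ on the whole interval (no turning point inside or at the ends).

```python
import math

def travel_time(U, E, x1, x2, n=10000):
    """Midpoint-rule estimate of the integral of dx / sqrt(2 (E - U(x))) over [x1, x2].

    Assumes E > U(x) everywhere on [x1, x2].
    """
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * h
        total += h / math.sqrt(2.0 * (E - U(x)))
    return total

def harmonic_potential(x):
    # Assumed example: U = x^2 / 2, i.e. the equation x'' = -x
    return 0.5 * x * x

t = travel_time(harmonic_potential, 0.5, 0.0, 0.5)
print("numerical time:", t, " expected pi/6:", math.pi / 6)
```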


Figure 11 Potential energy, panels (a) and (b)


PROBLEM. Draw the phase curves, given the potential energy graphs in 
Figure 11. 


ANSWER. Figure 12. 


Figure 12 Phase curves, panels (a) and (b)


PROBLEM. Draw the phase curves for the “equation of an ideal planar
pendulum”: $\ddot{x} = -\sin x$.

PROBLEM. Draw the phase curves for the “equation of a pendulum on a
rotating axis”: $\ddot{x} = -\sin x + M$.

Remark. In these two problems x denotes the angle of displacement of the
pendulum. The phase points whose coordinates differ by $2\pi$ correspond to
the same position of the pendulum. Therefore, in addition to the phase plane,
it is natural to look at the phase cylinder $\{x \pmod{2\pi},\ y\}$.


PROBLEM. Find the tangent lines to the branches of the critical level corresponding
to maximal potential energy $E = U(\xi)$ (Figure 13).

ANSWER. $y = \pm\sqrt{-U''(\xi)}\,(x - \xi)$.


Figure 13 Critical energy level lines


PROBLEM. Let $S(E)$ be the area enclosed by the closed phase curve corresponding
to the energy level E. Show that the period of motion along
this curve is equal to

$$T = \frac{dS}{dE}.$$


PROBLEM. Let $E_0$ be the value of the potential function at a minimum point
$\xi$. Find the period $T_0 = \lim_{E \to E_0} T(E)$ of small oscillations in a neighborhood
of the point $\xi$.

ANSWER. $2\pi/\sqrt{U''(\xi)}$.
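A small numerical experiment makes this limit concrete. The sketch below estimates $U''(\xi)$ by a central difference and compares $2\pi/\sqrt{U''(\xi)}$ with the period at a small finite amplitude, computed from the time quadrature after the substitution $x = a\sin\theta$ (which removes the singularity at the turning points). The pendulum potential $U = -\cos x$, the amplitude, and the helper names are assumptions made for illustration; the second function also assumes an even potential with its minimum at 0.

```python
import math

def small_oscillation_period(U, xi, h=1e-4):
    """2*pi / sqrt(U''(xi)), with U'' estimated by a central difference."""
    second = (U(xi + h) - 2.0 * U(xi) + U(xi - h)) / (h * h)
    return 2.0 * math.pi / math.sqrt(second)

def period_at_amplitude(U, a, n=2000):
    """Period of the oscillation between -a and a for an even potential U.

    Uses T = 4 * integral_0^a dx / sqrt(2 (U(a) - U(x))) with x = a sin(theta),
    evaluated by the midpoint rule in theta on (0, pi/2).
    """
    d_theta = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        x = a * math.sin(theta)
        total += a * math.cos(theta) / math.sqrt(2.0 * (U(a) - U(x)))
    return 4.0 * total * d_theta

def pendulum_potential(x):
    # Assumed example: U = -cos x, minimum at xi = 0 with U''(0) = 1
    return -math.cos(x)

T0 = small_oscillation_period(pendulum_potential, 0.0)
T_small = period_at_amplitude(pendulum_potential, 0.1)
print("limit period:", T0, " period at amplitude 0.1:", T_small)
```

For this potential the finite-amplitude period is slightly longer than the limit value, which is consistent with the period depending on the energy.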


PROBLEM. Consider a periodic motion along the closed phase curve corresponding
to the energy level E. Is it stable in the sense of Liapunov?¹²

ANSWER. No.¹³


D Phase flow 


Let M be a point in the phase plane. We look at the solution to system (2) 
whose initial conditions at t = 0 are represented by the point M. We assume 
that any solution of the system can be extended to the whole time axis. The 
value of our solution at any value of t depends on M. We denote the resulting 
phase point (Figure 14) by

$$M(t) = g^t M.$$

In this way we have defined a mapping of the phase plane to itself,
$g^t: \mathbb{R}^2 \to \mathbb{R}^2$. By theorems in the theory of ordinary differential equations,
the mapping $g^t$ is a diffeomorphism (a one-to-one differentiable mapping
with a differentiable inverse). The diffeomorphisms $g^t$, $t \in \mathbb{R}$, form a group:
$g^{t+s} = g^t \circ g^s$. The mapping $g^0$ is the identity ($g^0 M = M$), and $g^{-t}$ is the
inverse of $g^t$. The mapping $g: \mathbb{R} \times \mathbb{R}^2 \to \mathbb{R}^2$, defined by $g(t, M) = g^t M$, is
differentiable. All these properties together are expressed by saying that the
transformations $g^t$ form a one-parameter group of diffeomorphisms of the phase
plane. This group is also called the phase flow, given by system (2) (or
Equation (1)).

Figure 14 Phase flow

¹² For a definition, see, e.g., p. 155 of Ordinary Differential Equations by V. I. Arnold, MIT Press,
1973.

¹³ The only exception is the case when the period does not depend on the energy.


EXAMPLE. The phase flow given by the equation $\ddot{x} = -x$ is the group $g^t$
of rotations of the phase plane through angle t around the origin.
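The group property can be checked directly for this example. In the sketch below, `flow(t, point)` is a hypothetical helper that applies the explicit solution of $\dot{x} = y$, $\dot{y} = -x$; it rotates the phase point through angle t in the clockwise sense, in agreement with $x = r_0\cos(\varphi_0 - t)$, $y = r_0\sin(\varphi_0 - t)$. The sample point and times are arbitrary.

```python
import math

def flow(t, point):
    """Phase flow g^t of x' = y, y' = -x: a rotation through angle t (clockwise)."""
    x, y = point
    return (x * math.cos(t) + y * math.sin(t),
            -x * math.sin(t) + y * math.cos(t))

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

M0 = (0.3, -1.1)          # arbitrary phase point
t, s = 0.7, 2.4           # arbitrary times

# g^{t+s} = g^t o g^s, g^0 = identity, g^{-t} is the inverse of g^t
group_error = distance(flow(t + s, M0), flow(t, flow(s, M0)))
identity_error = distance(flow(0.0, M0), M0)
inverse_error = distance(flow(-t, flow(t, M0)), M0)
print(group_error, identity_error, inverse_error)
```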


PROBLEM. Show that the system with potential energy $U = -x^4$ does not
define a phase flow.


PROBLEM. Show that if the potential energy is positive, then there is a phase
flow.

Hint. Use the law of conservation of energy to show that a solution can 
be extended without bound. 


PROBLEM. Draw the image of the circle $x^2 + (y - 1)^2 < \tfrac{1}{4}$ under the action
of a transformation of the phase flow for the equations (a) of the “inverse
pendulum,” $\ddot{x} = x$ and (b) of the “nonlinear pendulum,” $\ddot{x} = -\sin x$.


ANSWER. Figure 15. 


Figure 15 Action of the phase flow on a circle, panels (a) and (b)


5 Systems with two degrees of freedom 


Analyzing a general potential system with two degrees of freedom is beyond the capability 
of modern science. In this paragraph we look at the simplest examples. 


A. Definitions 


By a system with two degrees of freedom we will mean a system defined by 
the differential equations 


$$\ddot{\mathbf{x}} = \mathbf{f}(\mathbf{x}), \qquad \mathbf{x} \in E^2, \tag{1}$$


where f is a vector field on the plane. 

A system is said to be conservative if there exists a function $U: E^2 \to \mathbb{R}$
such that $\mathbf{f} = -\partial U/\partial \mathbf{x}$. The equation of motion of a conservative system
then has the form¹⁴ $\ddot{\mathbf{x}} = -\partial U/\partial \mathbf{x}$.


B The law of conservation of energy 


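Theorem. The total energy of a conservative system is conserved, i.e.,

$$\frac{dE}{dt} = 0, \qquad \text{where } E = \tfrac{1}{2}\dot{\mathbf{x}}^2 + U(\mathbf{x}), \quad \dot{\mathbf{x}}^2 = (\dot{\mathbf{x}}, \dot{\mathbf{x}}).$$

PROOF. $dE/dt = (\dot{\mathbf{x}}, \ddot{\mathbf{x}}) + (\partial U/\partial \mathbf{x}, \dot{\mathbf{x}}) = (\ddot{\mathbf{x}} + \partial U/\partial \mathbf{x}, \dot{\mathbf{x}}) = 0$ by the equation
of motion. □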


Corollary. If at the initial moment the total energy is equal to E, then all
trajectories lie in the region where $U(\mathbf{x}) \le E$, i.e., a point remains inside
the potential well $U(x_1, x_2) \le E$ for all time.


Remark. In a system with one degree of freedom it is always possible to
introduce the potential energy

$$U(x) = -\int_{x_0}^{x} f(\xi)\,d\xi.$$

For a system with two degrees of freedom this is not so.


PROBLEM. Find an example of a system of the form $\ddot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, $\mathbf{x} \in E^2$, which is
not conservative.


C Phase space 
The equation of motion (1) can be written as the system:

$$\dot{x}_1 = y_1, \qquad \dot{x}_2 = y_2, \qquad \dot{y}_1 = -\frac{\partial U}{\partial x_1}, \qquad \dot{y}_2 = -\frac{\partial U}{\partial x_2}. \tag{2}$$

¹⁴ In cartesian coordinates on the plane $E^2$, $\ddot{x}_1 = -\partial U/\partial x_1$ and $\ddot{x}_2 = -\partial U/\partial x_2$.


The phase space of a system with two degrees of freedom is the four-dimensional
space with coordinates $x_1$, $x_2$, $y_1$, and $y_2$.

The system (2) defines the phase velocity vector field in four space as well
as¹⁵ the phase flow of the system (a one-parameter group of diffeomorphisms
of four-dimensional phase space). The phase curves of (2) are subsets of four-dimensional
phase space. All of phase space is partitioned into phase curves.
Projecting the phase curves from four space to the $x_1$, $x_2$ plane gives the
trajectories of our moving point in the $x_1$, $x_2$ plane. These trajectories are
also called orbits. Orbits can have points of intersection even when the phase
curves do not intersect one another. The equation of the law of conservation
of energy

$$E = \frac{\dot{\mathbf{x}}^2}{2} + U(\mathbf{x}) = \frac{y_1^2 + y_2^2}{2} + U(x_1, x_2)$$

defines a three-dimensional hypersurface in four space: $E(x_1, x_2, y_1, y_2) = E_0$;
this surface, $\pi_{E_0}$, remains invariant under the phase flow: $g^t \pi_{E_0} = \pi_{E_0}$.
One could say that the phase flow flows along the energy level hypersurfaces.
The phase velocity vector field is tangent at every point to $\pi_{E_0}$. Therefore,
$\pi_{E_0}$ is entirely composed of phase curves (Figure 16).


Figure 16 Energy level surface and phase curves


EXAMPLE 1 (“small oscillations of a spherical pendulum”). Let $U = \tfrac{1}{2}(x_1^2 + x_2^2)$.
The level sets of the potential energy in the $x_1$, $x_2$ plane will be concentric
circles (Figure 17).

The equations of motion, $\ddot{x}_1 = -x_1$, $\ddot{x}_2 = -x_2$, are equivalent to the
system

$$\dot{x}_1 = y_1, \qquad \dot{x}_2 = y_2, \qquad \dot{y}_1 = -x_1, \qquad \dot{y}_2 = -x_2.$$

This system decomposes into two independent ones; in other words,
each of the coordinates $x_1$ and $x_2$ changes with time in the same way as in
a system with one degree of freedom.


¹⁵ With the usual limitations.

Figure 17 Potential energy level curves for a spherical pendulum


A solution has the form

$$x_1 = c_1 \cos t + c_2 \sin t, \qquad x_2 = c_3 \cos t + c_4 \sin t,$$

$$y_1 = -c_1 \sin t + c_2 \cos t, \qquad y_2 = -c_3 \sin t + c_4 \cos t.$$

It follows from the law of conservation of energy that

$$E = \tfrac{1}{2}(y_1^2 + y_2^2) + \tfrac{1}{2}(x_1^2 + x_2^2) = \text{const},$$

i.e., the level surface $\pi_{E_0}$ is a sphere in four space.

PROBLEM. Show that the phase curves are great circles of this sphere. (A
great circle is the intersection of a sphere with a two-dimensional plane
passing through its center.)


PROBLEM. Show that the set of phase curves on the surface $\pi_{E_0}$ forms a two-dimensional
sphere. The formula $w = (x_1 + i y_1)/(x_2 + i y_2)$ gives the “Hopf
map” from the three sphere $\pi_{E_0}$ to the two sphere (the complex w-plane
completed by the point at infinity). Our phase curves are the pre-images
of points under the Hopf map.


PROBLEM. Find the projection of the phase curves on the $x_1$, $x_2$ plane (i.e.,
draw the orbits of the motion of a point).


EXAMPLE 2 (“Lissajous figures”). We look at one more example of a planar
motion (“small oscillations with two degrees of freedom”):

$$\ddot{x}_1 = -x_1, \qquad \ddot{x}_2 = -\omega^2 x_2.$$

The potential energy is

$$U = \tfrac{1}{2}x_1^2 + \tfrac{1}{2}\omega^2 x_2^2.$$


From the law of conservation of energy it follows that, if at the initial
moment of time the total energy is

$$\tfrac{1}{2}(\dot{x}_1^2 + \dot{x}_2^2) + U(x_1, x_2) = E,$$

then all motions will take place inside the ellipse $U(x_1, x_2) \le E$.


Our system consists of two independent one-dimensional systems. Therefore,
the law of conservation of energy is satisfied for each of them separately,
i.e., the following quantities are preserved

$$E_1 = \tfrac{1}{2}\dot{x}_1^2 + \tfrac{1}{2}x_1^2, \qquad E_2 = \tfrac{1}{2}\dot{x}_2^2 + \tfrac{1}{2}\omega^2 x_2^2 \qquad (E = E_1 + E_2).$$

Consequently, the variable $x_1$ is bounded by the region $|x_1| \le A_1$, $A_1 = \sqrt{2E_1(0)}$,
and $x_2$ oscillates within the region $|x_2| \le A_2$. The intersection
of these two regions defines a rectangle which contains the orbits (Figure 18).


Figure 18 The regions $U \le E$, $U_1 \le E$, and $U_2 \le E$

PROBLEM. Show that this rectangle is inscribed in the ellipse $U \le E$.


The general solution of our equations is $x_1 = A_1 \sin(t + \varphi_1)$, $x_2 =
A_2 \sin(\omega t + \varphi_2)$; a moving point independently performs an oscillation
with frequency 1 and amplitude $A_1$ along the horizontal and an oscillation
with frequency $\omega$ and amplitude $A_2$ along the vertical.

Consider the following method of describing an orbit in the $x_1$, $x_2$ plane.
We look at a cylinder with base $2A_1$ and a band of width $2A_2$. We draw on
the band a sine wave with period $2\pi A_1/\omega$ and amplitude $A_2$ and wind the
band onto the cylinder (Figure 19). The orthogonal projection of the sinusoid
wound around the cylinder onto the $x_1$, $x_2$ plane gives the desired orbit,
called a Lissajous figure.

Figure 19 Construction of a Lissajous figure

Lissajous figures can conveniently be seen on an oscilloscope which displays
independent harmonic oscillations on the horizontal and vertical axes.

The form of a Lissajous figure very strongly depends on the frequency $\omega$.
If $\omega = 1$ (the spherical pendulum of Example 1), then the curve on the
cylinder is an ellipse. The projection of this ellipse onto the $x_1$, $x_2$ plane
depends on the difference $\varphi_2 - \varphi_1$ between the phases. For $\varphi_1 = \varphi_2$ we get
a segment of the diagonal of the rectangle; for small $\varphi_2 - \varphi_1$ we get an
ellipse close to the diagonal and inscribed in the rectangle. For $\varphi_2 - \varphi_1 = \pi/2$
we get an ellipse with principal axes $x_1$, $x_2$; as $\varphi_2 - \varphi_1$ increases from $\pi/2$
to $\pi$ the ellipse collapses onto the second diagonal; as $\varphi_2 - \varphi_1$ increases
further the whole process is repeated from the beginning (Figure 20).


Figure 20 Series of Lissajous figures with $\omega = 1$


Now let the frequencies be only approximately equal: $\omega \approx 1$. The segment
of the curve corresponding to $0 \le t \le 2\pi$ is very close to an ellipse. The next
loop also reminds one of an ellipse, but here the phase shift $\varphi_2 - \varphi_1$ is
greater than in the original by $2\pi(\omega - 1)$. Therefore, the Lissajous curve
with $\omega \approx 1$ is a distorted ellipse, slowly progressing through all phases
from collapsed onto one diagonal to collapsed onto the other (Figure 21).

If one of the frequencies is twice the other ($\omega = 2$), then for some particular
phase shift the Lissajous figure becomes a doubly traversed arc (Figure 22).


Figure 21 Lissajous figure with $\omega \approx 1$


PROBLEM. Show that this curve is a parabola. By increasing the phase shift
$\varphi_2 - \varphi_1$ we get in turn the curves in Fig. 23.


In general, if one of the frequencies is n times bigger than the other ($\omega = n$),
then among the graphs of the corresponding Lissajous figures there is the
graph of a polynomial of degree n (Figure 24); this polynomial is called a
Chebyshev polynomial.
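This statement can be tested numerically under one specific choice of amplitudes and phases, which is an assumption of the sketch: $A_1 = A_2 = 1$ and $\varphi_1 = \varphi_2 = \pi/2$, so that $x_1 = \cos t$ and $x_2 = \cos nt$. The code evaluates the Chebyshev polynomial $T_n$ by the standard recurrence $T_{k+1}(x) = 2x\,T_k(x) - T_{k-1}(x)$ and checks that the points of the Lissajous figure satisfy $x_2 = T_n(x_1)$. For $n = 2$ this is the parabola $x_2 = 2x_1^2 - 1$.

```python
import math

def chebyshev(n, x):
    """Chebyshev polynomial T_n(x) via T_0 = 1, T_1 = x, T_{k+1} = 2x T_k - T_{k-1}."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

def lissajous_point(n, t):
    """Point of the orbit with A1 = A2 = 1 and both phases equal to pi/2 (assumed)."""
    return math.sin(t + math.pi / 2), math.sin(n * t + math.pi / 2)

max_error = 0.0
for n in (1, 2, 3, 5):
    for k in range(200):
        x1, x2 = lissajous_point(n, 0.05 * k)
        max_error = max(max_error, abs(chebyshev(n, x1) - x2))

print("largest deviation from x2 = T_n(x1):", max_error)
```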
Figure 22 Lissajous figure with $\omega = 2$

Figure 23 Series of Lissajous figures with $\omega = 2$

Figure 24 Chebyshev polynomials


PROBLEM. Show that if $\omega = m/n$, then the Lissajous figure is a closed algebraic
curve; but if $\omega$ is irrational, then the Lissajous figure fills the rectangle everywhere
densely. What does the corresponding phase trajectory fill out?


6 Conservative force fields 


In this section we study the connection between work and potential energy. 


A Work of a force field along a path 


Recall the definition of the work by a force $\mathbf{F}$ on a path $\mathbf{S}$. The work of the
constant force $\mathbf{F}$ (for example, the force with which we lift up a load) on the
path $\mathbf{S} = \overrightarrow{M_1 M_2}$ is, by definition, the scalar product (Figure 25)

$$A = (\mathbf{F}, \mathbf{S}) = |\mathbf{F}|\,|\mathbf{S}| \cos\varphi.$$

Figure 25 Work of the constant force F along the straight path S


Suppose we are given a vector field $\mathbf{F}$ and a curve $l$ of finite length. We
approximate the curve $l$ by a polygonal line with components $\Delta\mathbf{S}_i$ and denote
by $\mathbf{F}_i$ the value of the force at some particular point of $\Delta\mathbf{S}_i$; then the work of
the field $\mathbf{F}$ on the path $l$ is by definition (Figure 26)

$$A = \lim_{|\Delta\mathbf{S}_i| \to 0} \sum (\mathbf{F}_i, \Delta\mathbf{S}_i).$$

In analysis courses it is proved that if the field is continuous and the path
rectifiable, then the limit exists. It is denoted by $\int_l (\mathbf{F}, d\mathbf{S})$.


Figure 26 Work of the force field F along the path l


B Conditions for a field to be conservative 


Theorem. A vector field F is conservative if and only if its work along any 
path M,M, depends only on the endpoints of the path, and not on the shape 
of the path. 


PROOF. Suppose that the work of a field $\mathbf{F}$ does not depend on the path. Then

$$U(M) = -\int_{M_0}^{M} (\mathbf{F}, d\mathbf{S})$$

is well defined as a function of the point M. It is easy to verify that

$$\mathbf{F} = -\frac{\partial U}{\partial \mathbf{x}},$$

i.e., the field is conservative and U is its potential energy. Of course, the
potential energy is defined only up to the additive constant $U(M_0)$, which
can be chosen arbitrarily.

Conversely, suppose that the field $\mathbf{F}$ is conservative and that U is its
potential energy. Then it is easily verified that

$$\int_{M_0}^{M} (\mathbf{F}, d\mathbf{S}) = -U(M) + U(M_0),$$

i.e., the work does not depend on the shape of the path. □
PROBLEM. Show that the vector field $F_1 = x_2$, $F_2 = -x_1$ is not conservative
(Figure 27).


Figure 27. A non-potential field 


PROBLEM. Is the field in the plane minus the origin given by $F_1 = x_2/(x_1^2 + x_2^2)$,
$F_2 = -x_1/(x_1^2 + x_2^2)$ conservative? Show that a field is conservative if and
only if its work along any closed contour is equal to zero.
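The closed-contour criterion lends itself to a numerical experiment. The sketch below approximates the work along a circle around the origin by the polygonal sum $\sum (\mathbf{F}_i, \Delta\mathbf{S}_i)$ from Section 6A, taking $\mathbf{F}_i$ at the midpoint of each segment. It is applied to the two fields of the problems above and, for comparison, to the field $\mathbf{F} = -\mathbf{x}$, which is an added example with potential energy $U = (x_1^2 + x_2^2)/2$. The function names, the circular contour, and the number of segments are assumptions of the sketch; a nonzero result on one contour shows a field is not conservative, while a zero result on one contour proves nothing by itself.

```python
import math

def work_along_circle(field, radius=1.0, n=20000):
    """Polygonal approximation of the work of `field` once around a circle (counterclockwise)."""
    total = 0.0
    for i in range(n):
        s0 = 2.0 * math.pi * i / n
        s1 = 2.0 * math.pi * (i + 1) / n
        p0 = (radius * math.cos(s0), radius * math.sin(s0))
        p1 = (radius * math.cos(s1), radius * math.sin(s1))
        mid = (0.5 * (p0[0] + p1[0]), 0.5 * (p0[1] + p1[1]))
        f1, f2 = field(mid[0], mid[1])            # force at a point of the segment
        total += f1 * (p1[0] - p0[0]) + f2 * (p1[1] - p0[1])
    return total

def rotation_field(x1, x2):
    return (x2, -x1)

def punctured_field(x1, x2):
    r2 = x1 * x1 + x2 * x2
    return (x2 / r2, -x1 / r2)

def radial_field(x1, x2):
    # Added comparison case: F = -dU/dx with U = (x1^2 + x2^2) / 2
    return (-x1, -x2)

w_rotation = work_along_circle(rotation_field)
w_punctured = work_along_circle(punctured_field, radius=3.0)
w_radial = work_along_circle(radial_field)
print(w_rotation, w_punctured, w_radial)
```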


C Central fields 


Definition. A vector field in the plane $E^2$ is called central with center at 0,
if it is invariant with respect to the group of motions¹⁶ of the plane
which fix 0.

¹⁶ Including reflections.


PROBLEM. Show that all vectors of a central field lie on rays through 0, and 
that the magnitude of the vector field at a point depends only on the distance 
from the point to the center of the field. 

It is also useful to look at central fields which are not defined at the point 0. 


EXAMPLE. The newtonian field $\mathbf{F} = -k\,\mathbf{r}/|\mathbf{r}|^3$ is central, but the field in
the problem in Section 6B is not.


Theorem. Every central field is conservative, and its potential energy depends 
only on the distance to the center of the field, U = U(r). 


PROOF. According to the previous problem, we may set $\mathbf{F}(\mathbf{r}) = \Phi(r)\,\mathbf{e}_r$,
where $\mathbf{r}$ is the radius vector with respect to 0, $r$ is its length and the unit
vector $\mathbf{e}_r = \mathbf{r}/|\mathbf{r}|$ its direction. Then

$$\int_{M_1}^{M_2} (\mathbf{F}, d\mathbf{S}) = \int_{r(M_1)}^{r(M_2)} \Phi(r)\,dr,$$

and this integral is obviously independent of the path. □


PROBLEM. Compute the potential energy of the newtonian field. 


Remark. The definitions and theorems of this paragraph can be directly 
carried over to a euclidean space E” of any dimension. 


7 Angular momentum 


We will see later that the invariance of an equation of a mechanical problem with respect to some 
group of transformations always implies a conservation law. A central field is invariant with 
respect to the group of rotations. The corresponding first integral is called the angular momentum.


Definition. The motion of a material point (with unit mass) in a central field
on a plane is defined by the equation

$$\ddot{\mathbf{r}} = \Phi(r)\,\mathbf{e}_r,$$

where $\mathbf{r}$ is the radius vector beginning at the center of the field 0, $r$ is
its length, and $\mathbf{e}_r$ its direction. We will think of our plane as lying in three-dimensional
oriented euclidean space.


Definition. The angular momentum of a material point of unit mass relative
to the point 0 is the vector product

$$\mathbf{M} = [\mathbf{r}, \dot{\mathbf{r}}].$$

The vector $\mathbf{M}$ is perpendicular to our plane and is given by one number:
$\mathbf{M} = M\mathbf{n}$, where $\mathbf{n} = [\mathbf{e}_1, \mathbf{e}_2]$ is the normal vector, $\mathbf{e}_1$ and $\mathbf{e}_2$ being an
oriented frame in the plane (Figure 28).


Figure 28 Angular momentum 


Remark. In general, the moment of a vector a “applied at the point r” 
relative to the point 0 is [r, a]; for example, in a school statics course one 
studies the moment of force. [The literal translation of the Russian term for 
angular momentum is “kinetic moment.” (Trans. note) ] 


A The law of conservation of angular momentum 
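Lemma. Let $\mathbf{a}$ and $\mathbf{b}$ be two vectors changing with time in the oriented euclidean
space $\mathbb{R}^3$. Then

$$\frac{d}{dt}[\mathbf{a}, \mathbf{b}] = [\dot{\mathbf{a}}, \mathbf{b}] + [\mathbf{a}, \dot{\mathbf{b}}].$$

PROOF. This follows from the definition of derivative. □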


Theorem (The law of conservation of angular momentum). Under motions 
in a central field, the angular momentum M relative to the center of the 
field 0 does not change with time. 


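PROOF. By definition $\mathbf{M} = [\mathbf{r}, \dot{\mathbf{r}}]$. By the lemma, $\dot{\mathbf{M}} = [\dot{\mathbf{r}}, \dot{\mathbf{r}}] + [\mathbf{r}, \ddot{\mathbf{r}}]$. Since
the field is central it is apparent from the equations of motion that the vectors
$\ddot{\mathbf{r}}$ and $\mathbf{r}$ are collinear. Therefore $\dot{\mathbf{M}} = 0$. □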
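A numerical illustration of the theorem, under assumptions stated here: the field is taken to be newtonian with $k = 1$, and the motion is advanced with a velocity Verlet step (a choice of this sketch, not a method from the text). Each half of that step adds to the velocity a multiple of $\mathbf{r}$, or to the position a multiple of $\dot{\mathbf{r}}$, so the scalar $M = x\dot{y} - y\dot{x}$ changes only by rounding error.

```python
import math

def acceleration(x, y, k=1.0):
    """Newtonian central field: r'' = -k r / |r|^3 (k = 1 is an assumed value)."""
    r3 = (x * x + y * y) ** 1.5
    return -k * x / r3, -k * y / r3

def verlet_step(x, y, vx, vy, dt):
    """One velocity Verlet step: half kick, drift, half kick."""
    ax, ay = acceleration(x, y)
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
    x += dt * vx
    y += dt * vy
    ax, ay = acceleration(x, y)
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
    return x, y, vx, vy

def angular_momentum(x, y, vx, vy):
    """The number M in M = M n for a planar motion of unit mass."""
    return x * vy - y * vx

state = (1.0, 0.0, 0.0, 1.2)      # arbitrary bound initial condition
m0 = angular_momentum(*state)
max_change = 0.0
for _ in range(20000):
    state = verlet_step(*state, 0.001)
    max_change = max(max_change, abs(angular_momentum(*state) - m0))

print("M at t = 0:", m0, " largest change:", max_change)
```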


B Kepler’s law 


The law of conservation of angular momentum was first discovered by 
Kepler through observation of the motion of Mars. Kepler formulated this 
law in a slightly different way. 

We introduce polar coordinates $r$, $\varphi$ on our plane with pole at the center
of the field 0. We consider, at the point $\mathbf{r}$ with coordinates ($|\mathbf{r}| = r$, $\varphi$),
two unit vectors: $\mathbf{e}_r$, directed along the radius vector so that

$$\mathbf{r} = r\,\mathbf{e}_r,$$

and $\mathbf{e}_\varphi$, perpendicular to it in the direction of increasing $\varphi$. We express the
velocity vector $\dot{\mathbf{r}}$ in terms of the basis $\mathbf{e}_r$, $\mathbf{e}_\varphi$ (Figure 29).

Lemma. We have the relation

$$\dot{\mathbf{r}} = \dot{r}\,\mathbf{e}_r + r\dot{\varphi}\,\mathbf{e}_\varphi.$$


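Figure 29 Decomposition of the vector $\dot{\mathbf{r}}$ in terms of the basis $\mathbf{e}_r$, $\mathbf{e}_\varphi$

PROOF. Clearly, the vectors $\mathbf{e}_r$ and $\mathbf{e}_\varphi$ rotate with angular velocity $\dot{\varphi}$, i.e.,

$$\dot{\mathbf{e}}_r = \dot{\varphi}\,\mathbf{e}_\varphi, \qquad \dot{\mathbf{e}}_\varphi = -\dot{\varphi}\,\mathbf{e}_r.$$

Differentiating the equality $\mathbf{r} = r\,\mathbf{e}_r$ gives us

$$\dot{\mathbf{r}} = \dot{r}\,\mathbf{e}_r + r\,\dot{\mathbf{e}}_r = \dot{r}\,\mathbf{e}_r + r\dot{\varphi}\,\mathbf{e}_\varphi. \qquad \square$$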


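Consequently, the angular momentum is

$$\mathbf{M} = [\mathbf{r}, \dot{\mathbf{r}}] = [\mathbf{r}, \dot{r}\,\mathbf{e}_r] + [\mathbf{r}, r\dot{\varphi}\,\mathbf{e}_\varphi] = r\dot{\varphi}\,[\mathbf{r}, \mathbf{e}_\varphi] = r^2\dot{\varphi}\,[\mathbf{e}_r, \mathbf{e}_\varphi].$$

Thus, the quantity $M = r^2\dot{\varphi}$ is preserved. This quantity has a simple
geometric meaning.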


Figure 30 Sectorial velocity


Kepler called the rate of change of the area $S(t)$ swept out by the radius
vector the sectorial velocity C (Figure 30):

$$C = \frac{dS}{dt}.$$


The law discovered by Kepler through observation of the motion of the
planets says: in equal times the radius vector sweeps out equal areas, so
that the sectorial velocity is constant, $dS/dt = \text{const}$. This is one formulation
of the law of conservation of angular momentum. Since

$$\Delta S = S(t + \Delta t) - S(t) = \tfrac{1}{2}r^2\dot{\varphi}\,\Delta t + o(\Delta t),$$

this means that the sectorial velocity

$$C = \frac{dS}{dt} = \tfrac{1}{2}r^2\dot{\varphi} = \tfrac{1}{2}M$$

is half the angular momentum of our point of mass 1, and therefore constant.


EXAMPLE. Some satellites have very elongated orbits. By Kepler’s law such
a satellite spends most of its time in the distant part of its orbit, where the
magnitude of $\dot{\varphi}$ is small.


8 Investigation of motion in a central field 


The law of conservation of angular momentum lets us reduce problems about motion in a 
central field to problems with one degree of freedom. Thanks to this, motion in a central field can 
be completely determined. 


A Reduction to a one-dimensional problem 


We look at the motion of a point (of mass 1) in a central field on the plane:

$$\ddot{\mathbf{r}} = -\frac{\partial U}{\partial \mathbf{r}}, \qquad U = U(r).$$

It is natural to use polar coordinates $r$, $\varphi$.
By the law of conservation of angular momentum the quantity $M =
\dot{\varphi}(t)\,r^2(t)$ is constant (independent of t).


Theorem. For the motion of a material point of unit mass in a central field
the distance from the center of the field varies in the same way as r varies
in the one-dimensional problem with potential energy

$$V(r) = U(r) + \frac{M^2}{2r^2}.$$


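PROOF. Differentiating the relation shown in Section 7 ($\dot{\mathbf{r}} = \dot{r}\,\mathbf{e}_r + r\dot{\varphi}\,\mathbf{e}_\varphi$),
we find

$$\ddot{\mathbf{r}} = (\ddot{r} - r\dot{\varphi}^2)\,\mathbf{e}_r + (2\dot{r}\dot{\varphi} + r\ddot{\varphi})\,\mathbf{e}_\varphi.$$

Since the field is central,

$$\frac{\partial U}{\partial \mathbf{r}} = \frac{\partial U}{\partial r}\,\mathbf{e}_r.$$

Therefore the equation of motion in polar coordinates takes the form

$$\ddot{r} - r\dot{\varphi}^2 = -\frac{\partial U}{\partial r}, \qquad 2\dot{r}\dot{\varphi} + r\ddot{\varphi} = 0.$$

But, by the law of conservation of angular momentum,

$$\dot{\varphi} = \frac{M}{r^2},$$

where M is a constant independent of t, determined by the initial conditions.
Therefore,

$$\ddot{r} = -\frac{\partial U}{\partial r} + r\,\frac{M^2}{r^4} \qquad \text{or} \qquad \ddot{r} = -\frac{\partial V}{\partial r}, \quad \text{where } V = U + \frac{M^2}{2r^2}.$$

The quantity $V(r)$ is called the effective potential energy. □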
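To see the effective potential at work, the sketch below builds $V = U + M^2/2r^2$ for the newtonian potential $U = -k/r$ and locates its minimum by a ternary search. Setting $\partial V/\partial r = 0$ by hand gives $r = M^2/k$ and $V = -k^2/2M^2$ there, which the code reproduces. The values $k = 1$, $M = 1.5$, the search interval, and the helper names are assumptions of this illustration, and the search presumes that V has a single minimum on the interval.

```python
def effective_potential(U, M):
    """Return V(r) = U(r) + M^2 / (2 r^2) for a given potential U and momentum M."""
    return lambda r: U(r) + M * M / (2.0 * r * r)

def argmin_unimodal(f, lo, hi, iterations=200):
    """Ternary search for the minimum of f, assumed to have one minimum on [lo, hi]."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

K = 1.0      # assumed strength of the newtonian potential U = -K / r
M = 1.5      # assumed value of the angular momentum

V = effective_potential(lambda r: -K / r, M)
r_circular = argmin_unimodal(V, 0.1, 10.0)
print("minimum of V at r =", r_circular, " with V =", V(r_circular))
```

In the one-dimensional problem this minimum is an equilibrium position; in the plane it corresponds to motion along a circle of that radius.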


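Remark. The total energy in the derived one-dimensional problem

$$E_1 = \frac{\dot{r}^2}{2} + V(r)$$

is the same as the total energy in the original problem

$$E = \frac{\dot{\mathbf{r}}^2}{2} + U(r),$$

since

$$\frac{\dot{\mathbf{r}}^2}{2} = \frac{\dot{r}^2}{2} + \frac{r^2\dot{\varphi}^2}{2} = \frac{\dot{r}^2}{2} + \frac{M^2}{2r^2}.$$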


B Integration of the equation of motion 


The total energy in the derived one-dimensional problem is conserved.
Consequently, the dependence of r on t is defined by the quadrature

$$\dot{r} = \sqrt{2(E - V(r))}, \qquad \int dt = \int \frac{dr}{\sqrt{2(E - V(r))}}.$$

Since $\dot{\varphi} = M/r^2$, $d\varphi/dr = (M/r^2)/\sqrt{2(E - V(r))}$, and the equation of the
orbit in polar coordinates is found by quadrature,

$$\varphi = \int \frac{M/r^2\,dr}{\sqrt{2(E - V(r))}}.$$


C Investigation of the orbit 


We fix the value of the angular momentum at M. The variation of r with time 
is easy to visualize, if one draws the graph of the effective potential energy 
V(r) (Figure 31). 

Let E be the value of the total energy. All orbits corresponding to the given
E and M lie in the region $V(r) \le E$. On the boundary of this region, $V = E$,
i.e., $\dot{r} = 0$. Therefore, the velocity of the moving point, in general, is not equal
to zero since $\dot{\varphi} \ne 0$ for $M \ne 0$.

Figure 31 Graph of the effective potential energy

The inequality $V(r) \le E$ gives one or several annular regions in the plane:

$$0 \le r_{\min} \le r \le r_{\max} \le \infty.$$

If $0 \le r_{\min} < r_{\max} < \infty$, then the motion is bounded and takes place inside
the ring between the circles of radius $r_{\min}$ and $r_{\max}$.


Figure 32 Orbit of a point in a central field (labels: pericenter, apocenter)


The shape of an orbit is shown in Figure 32. The angle $\varphi$ varies monotonically
while r oscillates periodically between $r_{\min}$ and $r_{\max}$. The points
where $r = r_{\min}$ are called pericentral, and where $r = r_{\max}$, apocentral (if the
center is the earth, perigee and apogee; if it is the sun, perihelion and
aphelion; if it is the moon, perilune and apolune).

Each of the rays leading from the center to the apocenter or to the pericenter
is an axis of symmetry of the orbit.

In general, the orbit is not closed: the angle between the successive
pericenters and apocenters is given by the integral

$$\Phi = \int_{r_{\min}}^{r_{\max}} \frac{M/r^2\,dr}{\sqrt{2(E - V(r))}}.$$

The angle between two successive pericenters is twice as big.


Figure 33 Orbit dense in an annulus 


The orbit is closed if the angle $\Phi$ is commensurable with $2\pi$, i.e., if
$\Phi = 2\pi(m/n)$, where m and n are integers.

It can be shown that if the angle $\Phi$ is not commensurable with $2\pi$, then the
orbit is everywhere dense in the annulus (Figure 33).
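
The angle $\Phi$ can be estimated numerically for a given field. The following is a minimal sketch, not part of the original text: the function names, the bracketing radii `r_lo`, `r_inside`, `r_hi`, and the sample values of E and M are all assumptions chosen for illustration. The substitution $r = c + d\sin\theta$ removes the square-root singularities at the turning points, so a plain midpoint rule suffices.

```python
import math

def bisect(f, a, b, iters=200):
    """Root of f on [a, b] by bisection; f(a) and f(b) must differ in sign."""
    fa = f(a)
    for _ in range(iters):
        m = 0.5 * (a + b)
        fm = f(m)
        if (fa < 0) == (fm < 0):
            a, fa = m, fm
        else:
            b = m
    return 0.5 * (a + b)

def apsidal_angle(U, M, E, r_inside, r_lo, r_hi, n=4000):
    """Phi = integral from r_min to r_max of (M/r^2) dr / sqrt(2(E - V(r)))."""
    def excess(r):                      # E - V(r), with V = U + M^2 / (2 r^2)
        return E - U(r) - M * M / (2.0 * r * r)
    r_min = bisect(excess, r_lo, r_inside)   # turning points, where V = E
    r_max = bisect(excess, r_inside, r_hi)
    c, d = 0.5 * (r_max + r_min), 0.5 * (r_max - r_min)
    total = 0.0
    for i in range(n):                  # midpoint rule in theta, r = c + d sin(theta)
        theta = -0.5 * math.pi + (i + 0.5) * math.pi / n
        r = c + d * math.sin(theta)
        total += (M / (r * r)) * d * math.cos(theta) / math.sqrt(2.0 * excess(r))
    return total * math.pi / n

# Illustrative inputs (assumed): U = -1/r and U = r^2/2 with M = 1.
phi_kepler = apsidal_angle(lambda r: -1.0 / r, M=1.0, E=-0.25,
                           r_inside=1.0, r_lo=0.1, r_hi=10.0)
phi_oscillator = apsidal_angle(lambda r: 0.5 * r * r, M=1.0, E=2.0,
                               r_inside=1.0, r_lo=0.1, r_hi=10.0)
print(phi_kepler / math.pi, phi_oscillator / math.pi)
```

For these two fields the ratio $\Phi/\pi$ comes out rational (close to 1 and to 1/2), in agreement with the closedness criterion above; for a generic field one should expect an irrational ratio and a dense orbit.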

If $r_{\min} = r_{\max}$, i.e., E is the value of V at a minimum point, then the annulus
degenerates to a circle, which is also the orbit.


PROBLEM. For which values of $\alpha$ is motion along a circular orbit in the field
with potential energy $U = r^\alpha$, $-2 \le \alpha < \infty$, Liapunov stable?


ANSWER. Only for $\alpha = 2$.


For values of E a little larger than the minimum of V the annulus
$r_{\min} \le r \le r_{\max}$ will be very narrow, and the orbit will be close to a circle.
In the corresponding one-dimensional problem, r will perform small oscillations
close to the minimum point of V.


PROBLEM. Find the angle $\Phi$ for an orbit close to the circle of radius r.
Hint. Cf. Section D below. 


We now look at the case $r_{\max} = \infty$. If $\lim_{r\to\infty} U(r) = \lim_{r\to\infty} V(r) =
U_\infty < \infty$, then it is possible for orbits to go off to infinity. If the initial energy
E is larger than $U_\infty$, then the point goes to infinity with finite velocity
$\dot r_\infty = \sqrt{2(E - U_\infty)}$. We notice that if U(r) approaches its limit slower than $r^{-2}$,
then the effective potential V will be attracting at infinity (here we assume that
the potential U is attracting at infinity).

If, as $r \to 0$, $|U(r)|$ does not grow faster than $M^2/2r^2$, then $r_{\min} > 0$ and
the orbit never approaches the center. If, however, $U(r) + (M^2/2r^2) \to -\infty$
as $r \to 0$, then it is possible to “fall into the center of the field.” Falling into
the center of the field is possible even in finite time (for example, in the field
$U(r) = -1/r^3$).


PROBLEM. Examine the shape of an orbit in the case when the total energy 
is equal to the value of the effective energy V at a local maximum point. 


D Central fields in which all bounded orbits are 
closed 


It follows from the following sequence of problems that there are only two
cases in which all the bounded orbits in a central field are closed, namely,

$$U = ar^2, \quad a \ge 0$$

and

$$U = -k/r, \quad k \ge 0.$$


PROBLEM 1. Show that the angle $\Phi$ between the pericenter and apocenter
is equal to the semiperiod of an oscillation in the one-dimensional system
with potential energy $W(x) = U(M/x) + (x^2/2)$.

Hint. The substitution $x = M/r$ gives

$$\Phi = \int_{x_{\min}}^{x_{\max}} \frac{dx}{\sqrt{2(E - W)}}.$$

PROBLEM 2. Find the angle $\Phi$ for an orbit close to the circle of radius r.


ANSWER. $\Phi \approx \Phi_{\rm cir} = \pi\,\dfrac{M}{r^2\sqrt{V''(r)}} = \pi\sqrt{\dfrac{U'}{3U' + rU''}}$.


PROBLEM 3. For which potentials U is the magnitude of $\Phi_{\rm cir}$ independent of the
radius r?


ANSWER. $U(r) = ar^\alpha$ ($\alpha \ge -2$, $\alpha \ne 0$) and $U(r) = b \log r$.
It follows that $\Phi_{\rm cir} = \pi/\sqrt{\alpha + 2}$ (the logarithmic case corresponds to
$\alpha = 0$). For example, for $\alpha = 2$ we have $\Phi_{\rm cir} = \pi/2$, and for $\alpha = -1$ we have
$\Phi_{\rm cir} = \pi$.
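
As a quick check of Problems 2 and 3, the formula $\Phi_{\rm cir} = \pi\sqrt{U'/(3U' + rU'')}$ can be evaluated at several radii. This is only a sketch: the function name `phi_cir`, the finite-difference step `h` and the sample radii are assumptions, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math

def phi_cir(U, r, h=1e-4):
    """Phi_cir = pi * sqrt(U' / (3U' + r U'')), derivatives by central differences."""
    d1 = (U(r + h) - U(r - h)) / (2.0 * h)
    d2 = (U(r + h) - 2.0 * U(r) + U(r - h)) / (h * h)
    return math.pi * math.sqrt(d1 / (3.0 * d1 + r * d2))

# Three attracting fields from the answer to Problem 3 (alpha = 2, -1 and the log case).
fields = [("r^2", lambda r: r ** 2),
          ("-1/r", lambda r: -1.0 / r),
          ("log r", lambda r: math.log(r))]
for label, U in fields:
    # Phi_cir / pi at three radii; the values should not depend on r.
    print(label, [round(phi_cir(U, r) / math.pi, 6) for r in (0.5, 1.0, 3.0)])
```

The printed ratios should be the same at every radius for each of the three fields, which is the property asked for in Problem 3.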


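PROBLEM 4. In the situation of Problem 3, let $U(r) \to \infty$ as $r \to \infty$. Find
$\lim_{E\to\infty} \Phi(E, M)$.


ANSWER. $\pi/2$.


Hint. The substitution $x = y\,x_{\max}$ reduces $\Phi$ to the form

$$\Phi = \int_{y_{\min}}^{1} \frac{dy}{\sqrt{2(W^*(1) - W^*(y))}}, \qquad W^*(y) = \frac{y^2}{2} + \frac{1}{x_{\max}^2}\,U\!\left(\frac{M}{y\,x_{\max}}\right).$$

As $E \to \infty$ we have $x_{\max} \to \infty$ and $y_{\min} \to 0$, and the second term in $W^*$ can
be discarded.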
PROBLEM 5. Let $U(r) = -kr^{-\beta}$, $0 < \beta < 2$. Find $\Phi_0 = \lim_{E\to -0} \Phi$.


ANSWER. $\Phi_0 = \int_0^1 dx/\sqrt{x^\beta - x^2} = \pi/(2 - \beta)$. Note that $\Phi_0$ does not depend
on M.


PROBLEM 6. Find all central fields in which bounded orbits exist and are all 
closed. 


ANSWER. $U = ar^2$ or $U = -k/r$.


Solution. If all bounded orbits are closed, then, in particular, $\Phi_{\rm cir} =
2\pi(m/n) = \text{const}$. According to Problem 3, $U = ar^\alpha$ ($\alpha \ge -2$), or $U = b \ln r$
($\alpha = 0$). In both cases $\Phi_{\rm cir} = \pi/\sqrt{\alpha + 2}$. If $\alpha > 0$, then according to Problem
4, $\lim_{E\to\infty} \Phi(E, M) = \pi/2$. Therefore, $\Phi_{\rm cir} = \pi/2$, $\alpha = 2$. If $\alpha < 0$, then
according to Problem 5, $\lim_{E\to -0} \Phi(E, M) = \pi/(2 + \alpha)$. Therefore,
$\pi/(2 + \alpha) = \pi/\sqrt{2 + \alpha}$, $\alpha = -1$. In the case $\alpha = 0$ we find $\Phi_{\rm cir} = \pi/\sqrt2$,
which is not commensurable with $2\pi$. Therefore, all bounded orbits can be
closed only in fields where $U = ar^2$ or $U = -k/r$. In the field $U = ar^2$,
$a > 0$, all the orbits are closed (these are ellipses with center at 0, cf. Example
1, Section 5). In the field $U = -k/r$ all bounded orbits are also closed and
also elliptical, as we will now show.


E Kepler’s problem 


This problem concerns motion in a central field with potential $U = -k/r$
and therefore $V(r) = -(k/r) + (M^2/2r^2)$ (Figure 34).

Figure 34 Effective potential of the Kepler problem

By the general formula

$$\varphi = \int \frac{(M/r^2)\,dr}{\sqrt{2(E - V(r))}}.$$

Integrating, we get

$$\varphi = \arccos \frac{\dfrac{M}{r} - \dfrac{k}{M}}{\sqrt{2E + \dfrac{k^2}{M^2}}}.$$


To this expression we should have added an arbitrary constant. We
will assume it equal to zero; this is equivalent to the choice of an origin of
reference for the angle $\varphi$ at the pericenter. We introduce the following
notation:

$$\frac{M^2}{k} = p, \qquad \sqrt{1 + \frac{2EM^2}{k^2}} = e.$$

Now we get $\varphi = \arccos\bigl(((p/r) - 1)/e\bigr)$, i.e.,

$$r = \frac{p}{1 + e\cos\varphi}.$$


This is the so-called focal equation of a conic section. The motion is bounded 
(Figure 35) for E < 0. Then e < 1, i.e., the conic section is an ellipse. The
number p is called the parameter of the ellipse, and e the eccentricity. Kepler’s 
first law, which he discovered by observing the motion of Mars, consists 
of the fact that the planets describe ellipses, with the sun at one focus. 


Figure 35 Keplerian ellipse 


If we assume that the planets move in a central field of gravity, then 
Kepler’s first law implies Newton’s law of gravity: U = —(k/r) (cf. Section 
2D above). 

The parameter and eccentricity are related with the semi-axes by the
formulas

$$2a = \frac{p}{1 - e} + \frac{p}{1 + e} = \frac{2p}{1 - e^2},$$

i.e.,

$$a = \frac{p}{1 - e^2},$$

$e = c/a = \sqrt{a^2 - b^2}/a$, where $c = ae$ is the distance from the center to
the focus (cf. Figure 35).
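
The relations between (E, M, k) and the geometric elements of the ellipse are easy to tabulate. The sketch below is illustrative only: the function name `kepler_elements` and the sample values E = -0.25, M = 1, k = 1 are assumptions, not taken from the text.

```python
import math

def kepler_elements(E, M, k):
    """Return (p, e, a, b) for a bounded Kepler orbit (E < 0, M != 0, k > 0)."""
    p = M * M / k                                    # parameter
    e = math.sqrt(1.0 + 2.0 * E * M * M / (k * k))   # eccentricity
    a = p / (1.0 - e * e)                            # major semi-axis
    b = a * math.sqrt(1.0 - e * e)                   # minor semi-axis
    return p, e, a, b

E, M, k = -0.25, 1.0, 1.0                            # assumed sample values
p, e, a, b = kepler_elements(E, M, k)
r_peri, r_apo = p / (1.0 + e), p / (1.0 - e)         # focal equation at phi = 0 and phi = pi

def V(r):
    """Effective potential of the Kepler problem."""
    return -k / r + M * M / (2.0 * r * r)

print(p, e, a, b, r_peri, r_apo, V(r_peri), V(r_apo))
```

With these inputs the pericenter and apocenter distances add up to 2a, and the effective potential at both of them equals E, as it must at the turning points.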
Remark. An ellipse with small eccentricity is very close to a circle.17
If the distance from the focus to the center is small of first order, then the
difference between the semi-axes is of second order: $b = a\sqrt{1 - e^2} \approx
a(1 - \tfrac12 e^2)$. For example, in the ellipse with major semi-axis of 10 cm and
eccentricity 0.1, the difference of the semi-axes is 0.5 mm, and the distance
between the focus and the center is 1 cm.
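
The arithmetic of this example can be reproduced in a few lines; the variable names are of course only illustrative.

```python
import math

a_cm, e = 10.0, 0.1
b_cm = a_cm * math.sqrt(1.0 - e * e)      # exact minor semi-axis
diff_mm = 10.0 * (a_cm - b_cm)            # a - b in millimetres
approx_mm = 10.0 * a_cm * e * e / 2.0     # second-order estimate a e^2 / 2
c_cm = a_cm * e                           # focus-to-center distance
print(diff_mm, approx_mm, c_cm)
```

The exact difference of the semi-axes and the second-order estimate agree to within a few thousandths of a millimetre.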

The eccentricities of planets’ orbits are very small. Therefore, Kepler 
originally formulated his first law as follows: the planets move around the 
sun in circles, but the sun is not at the center. 

Kepler’s second law, that the sectorial velocity is constant, is true in any 
central field. 

Kepler’s third law says that the period of revolution around an elliptical 
orbit depends only on the size of the major semi-axes. 

The squares of the revolution periods of two planets on different elliptical 
orbits have the same ratio as the cubes of their major semi-axes.18


PROOF. We denote by T the period of revolution and by S the area swept
out by the radius vector in time T. $2S = MT$, since M/2 is the sectorial
velocity. But the area of the ellipse, S, is equal to $\pi ab$, so $T = 2\pi ab/M$. Since

$$a = \frac{M^2/k}{2|E|M^2/k^2} = \frac{k}{2|E|}$$

(from $a = p/(1 - e^2)$), and

$$b = \frac{k}{2|E|}\cdot\frac{\sqrt{2|E|}\,M}{k} = \frac{M}{\sqrt{2|E|}},$$

then $T = 2\pi k/(\sqrt{2|E|})^3$; but $2|E| = k/a$, so $T = 2\pi a^{3/2} k^{-1/2}$. □
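
The formula $T = 2\pi a^{3/2}k^{-1/2}$ can be tried on the earth's orbit. In this sketch the numerical values of k (the gravitational parameter of the sun) and of the major semi-axis are assumed reference values that do not appear in the text, and `kepler_period` is a hypothetical helper name.

```python
import math

def kepler_period(a, k):
    """T = 2 pi a^(3/2) k^(-1/2) for U = -k/r (point of unit mass)."""
    return 2.0 * math.pi * a ** 1.5 / math.sqrt(k)

GM_SUN = 1.327e20      # m^3/s^2, assumed value of k for the sun
AU = 1.496e11          # m, assumed major semi-axis of the earth's orbit

T_days = kepler_period(AU, GM_SUN) / 86400.0
ratio = kepler_period(4.0 * AU, GM_SUN) / kepler_period(AU, GM_SUN)
print(T_days, ratio)
```

The first number should be close to the length of a year in days; the second illustrates the third law, since a fourfold major semi-axis gives an eightfold period.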


We note that the total energy E depends only on the major semi-axis a 
of the orbit and is the same for the whole set of elliptical orbits, from a circle 
of radius a to a line segment of length 2a. 


PROBLEM. At the entry of a satellite into a circular orbit at a distance 300 km 
from the earth the direction of its velocity deviates from the intended direction 
by 1° towards the earth. How is the perigee changed? 


ANSWER. The height of the perigee is less by approximately 110 km. 


17 Let a drop of tea fall into a glass of tea close to the center. The waves collect at the symmetric 
point. The reason is that, by the focal definition of an ellipse, waves radiating from one focus of 
the ellipse collect at the other. 


18 By planets we mean here points in a central field. 


Figure 36 An orbit which is close to circular 


Hint. The orbit differs from a circle only to second order, and we can dis- 
regard this difference. The radius has the intended value since the initial 
energy has the intended value. Therefore, we get the true orbit (Figure 36) 
by twisting the intended orbit through 1°. 


PROBLEM. How does the height of the perigee change if the actual velocity
is 1 m/sec less than intended? 


PROBLEM. The first cosmic velocity is the velocity of motion on a circular
orbit of radius close to the radius of the earth. Find the magnitude of the
first cosmic velocity $v_1$ and show that $v_2 = \sqrt2\,v_1$ (cf. Section 3B).

ANSWER. About 7.9 km/sec.
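
A rough numerical estimate follows from equating the centripetal acceleration $v_1^2/R$ to g on a circular orbit of radius R, and from the energy condition $v_2^2/2 = gR$ for escape. The rounded values of g and R below are assumptions made for illustration.

```python
import math

G_SURFACE = 9.8        # m/s^2, assumed rounded acceleration of gravity at the surface
R_EARTH = 6.4e6        # m, assumed rounded radius of the earth

v1 = math.sqrt(G_SURFACE * R_EARTH)   # circular orbit: v^2 / R = g
v2 = math.sqrt(2.0) * v1              # escape: v^2 / 2 = g R
print(v1 / 1000.0, v2 / 1000.0)       # both in km/sec
```

With these rounded inputs the circular velocity comes out a little under 8 km/sec; the exact figure depends on the values adopted for g and R.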


PROBLEM.19 During his walk in outer space, the cosmonaut A. Leonov threw
the lens cap of his movie camera towards the earth. Describe the motion of 
the lens cap with respect to the spaceship, taking the velocity of the throw 
as 10 m/sec. 


ANSWER. The lens cap will move relative to the cosmonaut approximately 
in an ellipse with major axis about 32 km and minor axis about 16 km. The 
center of the ellipse will be situated 16 km in front of the cosmonaut in his 
orbit, and the period of circulation around the ellipse will be equal to the 
period of motion around the orbit. 

Hint. We take as our unit of length the radius of the space ship’s circular
orbit, and we choose a unit of time so that the period of revolution around this
orbit is $2\pi$. We must study solutions to Newton’s equation

$$\ddot{\mathbf r} = -\frac{\mathbf r}{r^3}$$

close to the circular solution with $r_0 = 1$, $\varphi_0 = t$. We seek those solutions
in the form

$$r = r_0 + r_1, \qquad \varphi = \varphi_0 + \varphi_1, \qquad r_1 \ll 1, \quad \varphi_1 \ll 1.$$


19 This problem is taken from V. V. Beletskii’s delightful book, “Sketches on the Motion of
Celestial Bodies,” Nauka, 1972. 
By the theorem on the differentiability of a solution with respect to its
initial conditions, the functions $r_1(t)$ and $\varphi_1(t)$ satisfy a system of linear
differential equations (equations of variation) up to small amounts which
are of higher than first order in the initial deviation.

By substituting the expressions for r and $\varphi$ in Newton’s equation, we get,
after simple computation, the variational equations in the form

$$\ddot r_1 = 3r_1 + 2\dot\varphi_1, \qquad \ddot\varphi_1 = -2\dot r_1.$$

After solving these equations for the given initial conditions ($r_1(0) =
\varphi_1(0) = \dot\varphi_1(0) = 0$, $\dot r_1(0) = -(1/800)$), we get the answer given above.
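
The variational equations can also be integrated numerically. The following is a minimal sketch using a classical fourth-order Runge-Kutta step; the orbit radius of 6400 km used to convert the dimensionless result into kilometres is an assumed round value, and all function and variable names are illustrative.

```python
import math

def deriv(s):
    """Right-hand side of the variational equations as a first-order system."""
    r1, vr, p1, vp = s                      # r_1, dr_1/dt, phi_1, dphi_1/dt
    return (vr, 3.0 * r1 + 2.0 * vp, vp, -2.0 * vr)

def rk4_step(s, dt):
    def shifted(k, c):
        return tuple(x + c * dx for x, dx in zip(s, k))
    k1 = deriv(s)
    k2 = deriv(shifted(k1, 0.5 * dt))
    k3 = deriv(shifted(k2, 0.5 * dt))
    k4 = deriv(shifted(k3, dt))
    return tuple(x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for x, a, b, c, d in zip(s, k1, k2, k3, k4))

R_KM = 6400.0                               # assumed orbit radius in km (illustrative)
steps = 4000
dt = 2.0 * math.pi / steps                  # one revolution in the chosen time unit
state = (0.0, -1.0 / 800.0, 0.0, 0.0)       # initial conditions from the hint
radial, along = [0.0], [0.0]
for _ in range(steps):
    state = rk4_step(state, dt)
    radial.append(R_KM * state[0])          # displacement along the radius, km
    along.append(R_KM * state[2])           # displacement along the orbit, km

radial_extent = max(radial) - min(radial)
along_extent = max(along) - min(along)
center_ahead = 0.5 * (max(along) + min(along))
print(radial_extent, along_extent, center_ahead, radial[-1], along[-1])
```

Under these assumptions the relative trajectory closes after one revolution, and its extents along the radius and along the orbit are in the ratio 1:2, in line with the answer quoted above.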


Disregarding the small quantities of second order gives an effect of under 
1/800 of the one obtained (i.e., on the order of 10 meters on one loop). 
Thus the lens cap describes a 30 km ellipse in an hour-and-a-half, returns 
to the space ship on the side opposite the earth, and goes past at the distance 
of a few tens of meters. 

Of course, in this calculation we have disregarded the deviation of the orbit 
from a circle, the effect of forces other than gravity, etc. 


9 The motion of a point in three-space 


In this paragraph we define the angular momentum relative to an axis and we show that, for 
motion in an axially symmetric field, it is conserved. 
All the results obtained for motion in a plane can be easily carried over to motions in space. 


A Conservative fields 
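We consider a motion in the conservative field

$$\ddot{\mathbf r} = -\frac{\partial U}{\partial \mathbf r},$$

where $U = U(\mathbf r)$, $\mathbf r \in E^3$.
The law of conservation of energy holds:

$$\frac{dE}{dt} = 0, \quad \text{where } E = \tfrac12 \dot{\mathbf r}^2 + U(\mathbf r).$$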


B Central fields 


For motion in a central field the vector $\mathbf M = [\mathbf r, \dot{\mathbf r}]$ does not change:
$d\mathbf M/dt = 0$.

Every central field is conservative (this is proved as in the two-dimensional
case), and

$$\frac{d\mathbf M}{dt} = [\dot{\mathbf r}, \dot{\mathbf r}] + [\mathbf r, \ddot{\mathbf r}] = 0,$$

since $\ddot{\mathbf r} = -(\partial U/\partial \mathbf r)$, and the vector $\partial U/\partial \mathbf r$ is collinear with $\mathbf r$ since the field is
central.

Corollary. For motion in a central field, every orbit is planar.


PROOF. $(\mathbf M, \mathbf r) = ([\mathbf r, \dot{\mathbf r}], \mathbf r) = 0$; therefore $\mathbf r(t) \perp \mathbf M$, and since $\mathbf M = \text{const.}$,
all orbits lie in the plane perpendicular to $\mathbf M$.20 □


Thus the study of orbits in a central field in space reduces to the planar 
problem examined in the previous paragraph. 


PROBLEM. Investigate motion in a central field in n-dimensional euclidean 
space. 


C Axially symmetric fields 


Definition. A vector field in $E^3$ has axial symmetry if it is invariant with
respect to the group of rotations of space which fix every point of some 
axis. 


PROBLEM. Show that if a field is axially symmetric and conservative, then its 
potential energy has the form $U = U(r, z)$, where r, $\varphi$, and z are cylindrical
coordinates. 

In particular, it follows from this that the vectors of the field lie in planes 
through the z axis. 


As an example of such a field we can take the gravitational field created 
by a solid of revolution. 


Figure 37 Moment of the vector F with respect to an axis


Let z be the axis, oriented by the vector $\mathbf e_z$, in three-dimensional euclidean
space $E^3$; $\mathbf F$ a vector in the euclidean linear space $\mathbb R^3$; O a point on the z axis;
$\mathbf r = x - O \in \mathbb R^3$ the radius vector of the point $x \in E^3$ relative to O (Figure 37).


Definition. The moment $M_z$ relative to the z axis of the vector $\mathbf F$ applied
at the point $\mathbf r$ is the projection onto the z axis of the moment of the vector
$\mathbf F$ relative to some point on this axis:

$$M_z = (\mathbf e_z, [\mathbf r, \mathbf F]).$$


20 The case $\mathbf M = 0$ is left to the reader.


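The number $M_z$ does not depend on the choice of the point O on the
z axis. In fact, if we look at a point O′ on the axis, then by properties of the
triple product, $M_z' = (\mathbf e_z, [\mathbf r', \mathbf F]) = ([\mathbf e_z, \mathbf r'], \mathbf F) = ([\mathbf e_z, \mathbf r], \mathbf F) = M_z$;
here $[\mathbf e_z, \mathbf r'] = [\mathbf e_z, \mathbf r]$ because $\mathbf r' - \mathbf r$ is parallel to $\mathbf e_z$.

Remark. $M_z$ depends on the choice of the direction of the z axis: if we change
$\mathbf e_z$ to $-\mathbf e_z$, then $M_z$ changes sign.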


Theorem. For a motion in a conservative field with axial symmetry around the 
z axis, the moment of velocity relative to the z axis is conserved. 


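PROOF. $M_z = (\mathbf e_z, [\mathbf r, \dot{\mathbf r}])$. Since $\ddot{\mathbf r} = \mathbf F$, it follows that $\mathbf r$ and $\ddot{\mathbf r}$ lie in a plane
passing through the z axis, and therefore $[\mathbf r, \ddot{\mathbf r}]$ is perpendicular to $\mathbf e_z$.
Therefore,

$$\dot M_z = (\mathbf e_z, [\dot{\mathbf r}, \dot{\mathbf r}]) + (\mathbf e_z, [\mathbf r, \ddot{\mathbf r}]) = 0. \qquad \square$$


Remark. This proof works for any force field in which the force vector $\mathbf F$
lies in the plane spanned by $\mathbf r$ and $\mathbf e_z$.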


10 Motions of a system of n points


In this paragraph we prove the laws of conservation of energy, momentum, and angular
momentum for systems of material points in $E^3$.


A Internal and external forces 


Newton’s equations for the motion of a system of n material points, with
masses $m_i$ and radius vectors $\mathbf r_i \in E^3$, are the equations

$$m_i \ddot{\mathbf r}_i = \mathbf F_i, \qquad i = 1, 2, \ldots, n.$$

The vector $\mathbf F_i$ is called the force acting on the i-th point.

The forces $\mathbf F_i$ are determined experimentally. We often observe in a
system that for two points these forces are equal in magnitude and act
in opposite directions along the straight line joining the points (Figure 38).


Figure 38 Forces of interaction 


Such forces are called forces of interaction (example: the force of universal 
gravitation). 

If all forces acting on a point of the system are forces of interaction, then
the system is said to be closed. By definition, the force acting on the i-th
point of a closed system is

$$\mathbf F_i = \sum_{\substack{j=1 \\ j \ne i}}^{n} \mathbf F_{ij}.$$


The vector $\mathbf F_{ij}$ is the force with which the j-th point acts on the i-th.

Since the forces $\mathbf F_{ij}$ and $\mathbf F_{ji}$ are opposite ($\mathbf F_{ij} = -\mathbf F_{ji}$), we can write them
in the form $\mathbf F_{ij} = f_{ij}\mathbf e_{ij}$, where $f_{ij} = f_{ji}$ is the magnitude of the force and $\mathbf e_{ij}$
is the unit vector in the direction from the i-th point to the j-th point.

If the system is not closed, then it is often possible to represent the forces
acting on it in the form

$$\mathbf F_i = \sum_{j \ne i} \mathbf F_{ij} + \mathbf F_i',$$

where $\mathbf F_{ij}$ are forces of interaction and $\mathbf F_i'(\mathbf r_i)$ is the so-called external force.


Figure 39 Internal and external forces 


EXAMPLE. (Figure 39) We separate a closed system into two parts, I and II.
The force $\mathbf F_i$ applied to the i-th point of system I is determined by forces of
interaction inside system I and forces acting on the i-th point from points
of system II, i.e.,

$$\mathbf F_i = \sum_{\substack{j \in \mathrm{I} \\ j \ne i}} \mathbf F_{ij} + \mathbf F_i'.$$

$\mathbf F_i'$ is the external force with respect to system I.

B The law of conservation of momentum 


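Definition. The momentum of a system is the vector

$$\mathbf P = \sum_{i=1}^{n} m_i \dot{\mathbf r}_i.$$

Theorem. The rate of change of momentum of a system is equal to the sum
of all external forces acting on points of the system.


PROOF. $d\mathbf P/dt = \sum_{i=1}^n m_i\ddot{\mathbf r}_i = \sum_{i=1}^n \mathbf F_i = \sum_{i,j} \mathbf F_{ij} + \sum_i \mathbf F_i' = \sum_i \mathbf F_i'$;
here $\sum_{i,j} \mathbf F_{ij} = 0$, since for forces of interaction $\mathbf F_{ij} = -\mathbf F_{ji}$. □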


Corollary 1. The momentum of a closed system is conserved. 


Corollary 2. If the sum of the exterior forces acting on a system is perpendicular
to the x axis, then the projection $P_x$ of the momentum onto the x axis is
conserved: $P_x = \text{const}$.


Definition. The center of mass of a system is the point

$$\mathbf r = \frac{\sum m_i \mathbf r_i}{\sum m_i}.$$


PROBLEM. Show that the center of mass is well defined, i.e., does not depend
on the choice of the origin of reference for radius vectors.


The momentum of a system is equal to the momentum of a particle lying at
the center of mass of the system and having mass $\sum m_i$.

In fact, $(\sum m_i)\mathbf r = \sum (m_i\mathbf r_i)$, from which it follows that $(\sum m_i)\dot{\mathbf r} = \sum m_i\dot{\mathbf r}_i$.

We can now formulate the theorem about momentum as a theorem about
the motion of the center of mass.


Theorem. The center of mass of a system moves as if all masses were concentrated
at it and all forces were applied to it.

PROOF. $(\sum m_i)\dot{\mathbf r} = \mathbf P$. Therefore, $(\sum m_i)\ddot{\mathbf r} = d\mathbf P/dt = \sum_i \mathbf F_i$. □


Corollary. If a system is closed, then its center of mass moves uniformly 
and linearly. 


C The law of conservation of angular momentum 


Definition. The angular momentum of a material point of mass m relative to the
point O is the moment of the momentum vector relative to O:

$$\mathbf M = [\mathbf r, m\dot{\mathbf r}].$$


The angular momentum of a system relative to O is the sum of the angular
momenta of all the points in the system:

$$\mathbf M = \sum_{i=1}^{n} [\mathbf r_i, m_i\dot{\mathbf r}_i].$$


Theorem. The rate of change of the angular momentum of a system is equal
to the sum of the moments of the external forces21 acting on the points of
the system.


PROOF. $d\mathbf M/dt = \sum_{i=1}^n [\dot{\mathbf r}_i, m_i\dot{\mathbf r}_i] + \sum_{i=1}^n [\mathbf r_i, m_i\ddot{\mathbf r}_i]$. The first term is equal
to zero, and the second is equal to

$$\sum_{i=1}^n [\mathbf r_i, \mathbf F_i] = \sum_{i=1}^n \Bigl[\mathbf r_i, \Bigl(\sum_{j \ne i} \mathbf F_{ij} + \mathbf F_i'\Bigr)\Bigr]$$

by Newton’s equations.


21 The moment of force is also called the torque [Trans. note]. 


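The sum of the moments of two forces of interaction is equal to zero since
$\mathbf F_{ij} = -\mathbf F_{ji}$, so $[\mathbf r_i, \mathbf F_{ij}] + [\mathbf r_j, \mathbf F_{ji}] = [(\mathbf r_i - \mathbf r_j), \mathbf F_{ij}] = 0$.


Therefore, the sum of the moments of all forces of interaction is equal
to zero:

$$\sum_{i=1}^n \Bigl[\mathbf r_i, \sum_{j \ne i} \mathbf F_{ij}\Bigr] = 0.$$

Therefore, $d\mathbf M/dt = \sum_{i=1}^n [\mathbf r_i, \mathbf F_i']$. □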


Corollary 1 (The law of conservation of angular momentum). If the system 
is closed, then M = const. 


We denote the sum of the moments of the external forces by
$\mathbf N = \sum_{i=1}^n [\mathbf r_i, \mathbf F_i']$.
Then, by the theorem above, $d\mathbf M/dt = \mathbf N$, from which we have


Corollary 2. If the moment of the external forces relative to the z axis is
equal to zero, then $M_z$ is constant.
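
The conservation of momentum and of angular momentum in a closed system can be observed numerically. The sketch below is an illustration under stated assumptions, not a method taken from the text: the pairwise forces are taken to be Newtonian attraction with unit gravitational constant, the masses and initial data are arbitrary, and the time stepping is a semi-implicit Euler scheme (velocities first, then positions). Because the forces are applied in equal and opposite pairs directed along the line joining the points, this scheme changes P and M only through rounding errors.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def totals(m, r, v):
    """Total momentum P and angular momentum M about the origin."""
    P = [sum(m[i] * v[i][k] for i in range(len(m))) for k in range(3)]
    Mom = [0.0, 0.0, 0.0]
    for i in range(len(m)):
        c = cross(r[i], v[i])
        for k in range(3):
            Mom[k] += m[i] * c[k]
    return P, Mom

def step(m, r, v, dt):
    """One semi-implicit Euler step with pairwise forces F_ij = -F_ji along r_j - r_i."""
    n = len(m)
    F = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [r[j][k] - r[i][k] for k in range(3)]
            dist = sum(x * x for x in d) ** 0.5
            f = m[i] * m[j] / dist ** 3          # assumed Newtonian attraction, G = 1
            for k in range(3):
                F[i][k] += f * d[k]
                F[j][k] -= f * d[k]
    for i in range(n):
        for k in range(3):
            v[i][k] += dt * F[i][k] / m[i]
            r[i][k] += dt * v[i][k]

# Arbitrary illustrative initial data for three points.
m = [1.0, 2.0, 3.0]
r = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v = [[0.0, 0.3, 0.0], [0.0, 0.0, 0.2], [0.1, 0.0, 0.0]]
P0, M0 = totals(m, r, v)
for _ in range(200):
    step(m, r, v, 1e-3)
P1, M1 = totals(m, r, v)
drift_P = max(abs(a - b) for a, b in zip(P0, P1))
drift_M = max(abs(a - b) for a, b in zip(M0, M1))
print(drift_P, drift_M)
```

Both printed drifts should be at the level of floating-point rounding, while the individual momenta of the points change appreciably over the same interval.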


D The law of conservation of energy 
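Definition. The kinetic energy of a point of mass m is

$$T = \frac{m\dot{\mathbf r}^2}{2}.$$


Definition. The kinetic energy of a system of mass points is the sum of the
kinetic energies of the points:

$$T = \sum_{i=1}^n \frac{m_i\dot{\mathbf r}_i^2}{2},$$

where the $m_i$ are the masses of the points and $\dot{\mathbf r}_i$ are their velocities.

Theorem. The increase in the kinetic energy of a system is equal to the sum of
the work of all forces acting on the points of the system.


PROOF.

$$\frac{dT}{dt} = \sum_{i=1}^n m_i(\dot{\mathbf r}_i, \ddot{\mathbf r}_i) = \sum_{i=1}^n (\dot{\mathbf r}_i, m_i\ddot{\mathbf r}_i) = \sum_{i=1}^n (\dot{\mathbf r}_i, \mathbf F_i).$$

Therefore,

$$T(t) - T(t_0) = \int_{t_0}^{t} \frac{dT}{dt}\,dt = \sum_{i=1}^n \int_{t_0}^{t} (\dot{\mathbf r}_i, \mathbf F_i)\,dt = \sum_{i=1}^n A_i. \qquad \square$$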


The configuration space of a system of n mass points in $E^3$ is the direct
product of n euclidean spaces: $E^{3n} = E^3 \times \cdots \times E^3$. It has itself the structure
of a euclidean space.

Let $\mathbf r = (\mathbf r_1, \ldots, \mathbf r_n)$ be the radius vector of a point in the configuration
space, and $\mathbf F = (\mathbf F_1, \ldots, \mathbf F_n)$ the force vector. We can write the theorem above
in the form

$$T(t_1) - T(t_0) = \int_{\mathbf r(t_0)}^{\mathbf r(t_1)} (\mathbf F, d\mathbf r) = \int_{t_0}^{t_1} (\dot{\mathbf r}, \mathbf F)\,dt.$$

In other words:
The increase in kinetic energy is equal to the work of the “force” $\mathbf F$
on the “path” $\mathbf r(t)$ in configuration space.


Definition. A system is called conservative if the forces depend only on the
location of a point in the system ($\mathbf F = \mathbf F(\mathbf r)$), and if the work of $\mathbf F$ along
any path depends only on the initial and final points of the path:

$$\int_{M_1}^{M_2} (\mathbf F, d\mathbf r) = \Phi(M_1, M_2).$$


Theorem. For a system to be conservative it is necessary and sufficient that
there exist a potential energy, i.e., a function $U(\mathbf r)$ such that

$$\mathbf F = -\frac{\partial U}{\partial \mathbf r}.$$

PROOF. Cf. Section 6B. □
Theorem. The total energy of a conservative system ($E = T + U$) is preserved
under the motion: $E(t_1) = E(t_0)$.


PROOF. By what was shown earlier,

$$T(t_1) - T(t_0) = \int_{\mathbf r(t_0)}^{\mathbf r(t_1)} (\mathbf F, d\mathbf r) = U(\mathbf r(t_0)) - U(\mathbf r(t_1)). \qquad \square$$


Let all the forces acting on the points of a system be divided into forces of
interaction and external forces:

$$\mathbf F_i = \sum_{i \ne j} \mathbf F_{ij} + \mathbf F_i',$$

where $\mathbf F_{ij} = -\mathbf F_{ji} = f_{ij}\mathbf e_{ij}$.


Proposition. If the forces of interaction depend only on distance, $f_{ij} =
f_{ij}(|\mathbf r_i - \mathbf r_j|)$, then they are conservative.


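PROOF. If a system consists entirely of two points i and j, then, as is easily
seen, the potential energy of the interaction is given by the formula

$$U_{ij}(r) = \int_{r_0}^{r} f_{ij}(\rho)\,d\rho.$$

We then have

$$-\frac{\partial U_{ij}(|\mathbf r_i - \mathbf r_j|)}{\partial \mathbf r_i} = -f_{ij}\,\frac{\partial |\mathbf r_i - \mathbf r_j|}{\partial \mathbf r_i} = f_{ij}\mathbf e_{ij}.$$

Therefore, the potential energy of the interaction of all the points will be

$$U(\mathbf r) = \sum_{i>j} U_{ij}(|\mathbf r_i - \mathbf r_j|). \qquad \square$$

If the external forces are also conservative, i.e., $\mathbf F_i' = -(\partial U_i'/\partial \mathbf r_i)$, then
the system is conservative, and its total potential energy is

$$U(\mathbf r) = \sum_{i>j} U_{ij} + \sum_i U_i'.$$

For such a system the total mechanical energy

$$E = T + U = \sum_i \frac{m_i\dot{\mathbf r}_i^2}{2} + \sum_{i>j} U_{ij} + \sum_i U_i'$$

is conserved.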


If the system is not conservative, then the total mechanical energy is not 
generally conserved. 


Definition. A decrease in the mechanical energy $E(t_0) - E(t_1)$ is called an
increase in the non-mechanical energy $E'$:

$$E'(t_1) - E'(t_0) = E(t_0) - E(t_1).$$


Theorem (The law of conservation of energy). The total energy H = E+ E’ 
is conserved. 


This theorem is an obvious corollary of the definition above. Its value lies 
in the fact that in concrete physical systems, expressions for the size of the 
non-mechanical energy can be found in terms of other physical quantities 
(temperature, etc.). 


E Example: The two-body problem 


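Suppose that two points with masses $m_1$ and $m_2$ interact with potential U,
so that the equations of motion have the form

$$m_1\ddot{\mathbf r}_1 = -\frac{\partial U}{\partial \mathbf r_1}, \qquad m_2\ddot{\mathbf r}_2 = -\frac{\partial U}{\partial \mathbf r_2}, \qquad U = U(|\mathbf r_1 - \mathbf r_2|).$$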


Theorem. The time variation of $\mathbf r = \mathbf r_2 - \mathbf r_1$ in the two-body problem is the
same as that for the motion of a point of mass $m = m_1m_2/(m_1 + m_2)$ in a
field with potential $U(|\mathbf r|)$.


We denote by $\mathbf r_0$ the radius vector of the center of mass: $\mathbf r_0 =
(m_1\mathbf r_1 + m_2\mathbf r_2)/(m_1 + m_2)$. By the theorem on the conservation of momentum,
the point $\mathbf r_0$ moves uniformly and linearly.

We now look at the vector $\mathbf r = \mathbf r_2 - \mathbf r_1$. Multiplying the first of the
equations of motion by $m_2$, the second by $m_1$, and subtracting, we find that
$m_1m_2\ddot{\mathbf r} = -(m_1 + m_2)(\partial U/\partial \mathbf r)$, where $U = U(|\mathbf r_1 - \mathbf r_2|) = U(|\mathbf r|)$.

In particular, in the case of a Newtonian attraction, the points describe 
conic sections with foci at their common center of mass (Figure 40). 


Figure 40 The two body problem


PROBLEM. Determine the major semi-axis of the ellipse which the center of 
the earth describes around the common center of mass of the earth and the 
moon. Where is this center of mass, inside the earth or outside? (The mass 
of the moon is 1/81 times the mass of the earth.) 
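
Since $\mathbf r_1 - \mathbf r_0 = -\frac{m_2}{m_1 + m_2}\mathbf r$ and $\mathbf r_2 - \mathbf r_0 = \frac{m_1}{m_1 + m_2}\mathbf r$, each body describes an orbit similar to the relative one, scaled by the opposite mass fraction. A sketch of the resulting arithmetic is given below; the mean earth-moon distance and the radius of the earth are assumed reference values that do not appear in the text, and the function names are hypothetical.

```python
def reduced_mass(m1, m2):
    return m1 * m2 / (m1 + m2)

def orbit_semi_axes(a_relative, m1, m2):
    """Semi-axes of the orbits of bodies 1 and 2 about the center of mass,
    given the major semi-axis a_relative of the relative orbit r = r2 - r1."""
    return a_relative * m2 / (m1 + m2), a_relative * m1 / (m1 + m2)

EARTH_MOON_KM = 384_400.0   # assumed mean earth-moon distance
EARTH_RADIUS_KM = 6_371.0   # assumed mean radius of the earth

# Masses in units of the moon's mass: earth = 81, moon = 1 (ratio from the problem).
a_earth, a_moon = orbit_semi_axes(EARTH_MOON_KM, m1=81.0, m2=1.0)
print(a_earth, a_earth < EARTH_RADIUS_KM, reduced_mass(81.0, 1.0))
```

With these inputs the semi-axis of the earth's orbit about the common center of mass is smaller than the radius of the earth.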


11 The method of similarity 


In some cases it is possible to obtain important information from the form of the equations of 
motion without solving them, by using the methods of similarity and dimension. The main idea 
in these methods is to choose a change of scale (of time, length, mass, etc.) under which the 
equations of motion preserve their form. 


A Example 


Let $\mathbf r(t)$ satisfy the equation $m(d^2\mathbf r/dt^2) = -(\partial U/\partial \mathbf r)$. We set $t_1 = \alpha t$ and
$m_1 = \alpha^2 m$. Then $\mathbf r(t_1)$ satisfies the equation $m_1 \cdot (d^2\mathbf r/dt_1^2) = -(\partial U/\partial \mathbf r)$. In
other words:

If the mass of a point is decreased by a factor of 4, then the point can travel
the same orbit in the same force field twice as fast.22


22 Here we are assuming that U does not depend on m. In the field of gravity, the potential
energy U is proportional to m, and therefore the acceleration does not depend on the mass m 
of the moving point. 


B A problem 


Suppose that the potential energy of a central field is a homogeneous function
of degree $\nu$:

$$U(\alpha \mathbf r) = \alpha^\nu U(\mathbf r) \quad \text{for any } \alpha > 0.$$


Show that if a curve $\gamma$ is the orbit of a motion, then the homothetic
curve $\alpha\gamma$ is also an orbit (under the appropriate initial conditions). Determine
the ratio of the circulation times along these orbits. Deduce from this the
isochronicity of the oscillation of a pendulum ($\nu = 2$) and Kepler’s third law
($\nu = -1$).
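
One way to organize the computation: substituting $\mathbf r \to \alpha\mathbf r$, $t \to \beta t$ into $m\ddot{\mathbf r} = -\partial U/\partial\mathbf r$ multiplies the left side by $\alpha/\beta^2$ and the right side by $\alpha^{\nu-1}$, so the equation keeps its form when $\beta = \alpha^{1-\nu/2}$. The short sketch below just evaluates this ratio; the function name is hypothetical.

```python
def time_scaling(alpha, nu):
    """Ratio of circulation times on the orbits alpha*gamma and gamma: alpha**(1 - nu/2)."""
    return alpha ** (1.0 - nu / 2.0)

print(time_scaling(2.0, 2))     # degree 2: the ratio is 1 for any alpha
print(time_scaling(4.0, -1))    # degree -1: the ratio is alpha**1.5
```

For $\nu = 2$ the period does not depend on the size of the orbit, and for $\nu = -1$ the squares of the periods are proportional to the cubes of the linear dimensions.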


PROBLEM. If the radius of a planet is $\alpha$ times the radius of the earth and its
mass $\beta$ times that of the earth, find the ratio of the acceleration of the force
of gravity and the first and second cosmic velocities to the corresponding
quantities for the earth.


ANSWER. $\gamma = \beta\alpha^{-2}$, $\delta = \sqrt{\beta/\alpha}$.


For the moon, for example, $\alpha = 1/3.7$ and $\beta = 1/81$. Therefore, the acceleration
of gravity is about 1/6 that of the earth ($\gamma \approx 1/6$), and the cosmic
velocities are about 1/5 those for the earth ($\delta \approx 1/4.7$).
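
The numbers for the moon can be checked directly from the two formulas of the answer; the function names below are illustrative.

```python
import math

def gravity_ratio(alpha, beta):
    return beta / alpha ** 2            # gamma = beta * alpha^-2

def velocity_ratio(alpha, beta):
    return math.sqrt(beta / alpha)      # delta = sqrt(beta / alpha)

alpha, beta = 1.0 / 3.7, 1.0 / 81.0     # the moon, values from the text
gamma, delta = gravity_ratio(alpha, beta), velocity_ratio(alpha, beta)
print(1.0 / gamma, 1.0 / delta)
```

The two printed reciprocals should be close to 6 and to 4.7 respectively.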


PROBLEM.23 A desert animal has to cover great distances between sources of
water. How does the maximal time the animal can run depend on the size
L of the animal?


ANSWER. It is directly proportional to L.


Solution. The store of water is proportional to the volume of the body,
i.e., $L^3$; the evaporation is proportional to the surface area, i.e., $L^2$. Therefore,
the maximal time of a run from one source to another is directly proportional
to L.

We notice that the maximal distance an animal can run also grows
proportionally to L (cf. the following problem).


PROBLEM.24 How does the running velocity of an animal on level ground
and uphill depend on the size L of the animal?


ANSWER. On level ground $\sim L^0$, uphill $\sim L^{-1}$.


23 J. M. Smith, Mathematical Ideas in Biology, Cambridge University Press, 1968.
24 Ibid. 




Solution. The power developed by the animal is proportional to $L^2$
(the percentage used by muscle is constant at about 25%, the other 75% of
the chemical energy is converted to heat; the heat output is proportional
to the body surface, i.e., $L^2$, which means that the effective power is
proportional to $L^2$).

The force of air resistance is directly proportional to the square of the
velocity and the area of a cross-section; the power spent on overcoming
it is therefore proportional to $v^2L^2v$. Therefore, $v^3L^2 \sim L^2$, so $v \sim L^0$. In
fact, the running velocity on level ground, no smaller for a rabbit than for
a horse, in practice does not specifically depend on the size.

The power necessary to run uphill is $mgv \sim L^3v$; since the generated power
is $\sim L^2$, we find that $v \sim L^{-1}$. In fact, a dog easily runs up a hill, while a
horse slows its pace.


PROBLEM.24a How does the height of an animal’s jump depend on its size?

ANSWER. $\sim L^0$.


Solution. For a jump of height h one needs energy proportional to $L^3h$,
and the work accomplished by muscular strength F is proportional to FL.
The force F is proportional to $L^2$ (since the strength of bones is proportional
to their section). Therefore, $L^3h \sim L^2L$, i.e., the height of a jump does not
depend on the size of the animal. In fact, a jerboa and a kangaroo can jump
to approximately the same height.


24a Ibid.
PART II
LAGRANGIAN MECHANICS


Lagrangian mechanics describes motion in a mechanical system by means of 
the configuration space. The configuration space of a mechanical system has 
the structure of a differentiable manifold, on which its group of diffeo- 
morphisms acts. The basic ideas and theorems of lagrangian mechanics are 
invariant under this group,25 even if formulated in terms of local coordinates.

A lagrangian mechanical system is given by a manifold (“configuration 
space”) and a function on its tangent bundle (“the lagrangian function”). 

Every one-parameter group of diffeomorphisms of configuration space 
which fixes the lagrangian function defines a conservation law (i.e., a first 
integral of the equations of motion). 

A newtonian potential system is a particular case of a lagrangian system 
(the configuration space in this case is euclidean, and the lagrangian function 
is the difference between the kinetic and potential energies). 

The lagrangian point of view allows us to solve completely a series of 
important mechanical problems, including problems in the theory of small 
oscillations and in the dynamics of a rigid body. 


25 And even under larger groups of transformations, which also affect time. 


Variational principles 


In this chapter we show that the motions of a newtonian potential system 
are extremals of a variational principle, “Hamilton’s principle of least 
action.” 

This fact has many important consequences, including a quick method 
for writing equations of motion in curvilinear coordinate systems, and a 
series of qualitative deductions—for example, a theorem on returning to a 
neighborhood of the initial point. 

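In this chapter we will use an n-dimensional coordinate space. A vector
in such a space is a set of numbers $\mathbf x = (x_1, \ldots, x_n)$. Similarly, $\partial f/\partial \mathbf x$ means
$(\partial f/\partial x_1, \ldots, \partial f/\partial x_n)$, and $(\mathbf a, \mathbf b) = a_1b_1 + \cdots + a_nb_n$.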


12 Calculus of variations 


For what follows, we will need some facts from the calculus of variations. A more detailed 
exposition can be found in “A Course in the Calculus of Variations” by M. A. Lavrentiev and 
L. A. Lusternik, M. L., 1938, or G. E. Shilov, “Elementary Functional Analysis,” MIT Press, 
1974.

The calculus of variations is concerned with the extremals of functions 
whose domain is an infinite-dimensional space: the space of curves. Such 
functions are called functionals. 

An example of a functional is the length of a curve in the euclidean plane:
if $\gamma = \{(t, x): x(t) = x,\ t_0 \le t \le t_1\}$, then $\Phi(\gamma) = \int_{t_0}^{t_1} \sqrt{1 + \dot x^2}\,dt$.

In general, a functional is any mapping from the space of curves to the
real numbers.

We consider an “approximation” $\gamma'$ to $\gamma$, $\gamma' = \{(t, x): x = x(t) + h(t)\}$.
We will call it $\gamma' = \gamma + h$. Consider the increment of $\Phi$, $\Phi(\gamma + h) - \Phi(\gamma)$
(Figure 41).

Figure 41 Variation of a curve


A Variations 


Definition. A functional $\Phi$ is called differentiable26 if $\Phi(\gamma + h) - \Phi(\gamma) =
F + R$, where F depends linearly on h (i.e., for a fixed $\gamma$, $F(h_1 + h_2) =
F(h_1) + F(h_2)$ and $F(ch) = cF(h)$), and $R(h, \gamma) = O(h^2)$ in the sense that,
for $|h| < \varepsilon$ and $|dh/dt| < \varepsilon$, we have $|R| < C\varepsilon^2$. The linear part of the
increment, F(h), is called the differential.


It can be shown that if ® is differentiable, its differential is uniquely 
defined. The differential of a functional is also called its variation, and h is 
called a variation of the curve. 


EXAMPLE. Let $\gamma = \{(t, x): x = x(t),\ t_0 \le t \le t_1\}$ be a curve in the $(t, x)$-plane;
$\dot{x} = dx/dt$; $L = L(a, b, c)$ a differentiable function of three variables. We
define a functional $\Phi$ by

$$\Phi(\gamma) = \int_{t_0}^{t_1} L(x(t), \dot{x}(t), t)\, dt.$$

In case $L = \sqrt{1 + b^2}$, we get the length of $\gamma$.


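Theorem. The functional $\Phi(\gamma) = \int_{t_0}^{t_1} L(x, \dot{x}, t)\, dt$ is differentiable, and its
derivative is given by the formula

$$F(h) = \int_{t_0}^{t_1} \left[ \frac{\partial L}{\partial x} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} \right] h\, dt + \left( \frac{\partial L}{\partial \dot{x}}\, h \right) \Big|_{t_0}^{t_1}.$$

PROOF.

$$\Phi(\gamma + h) - \Phi(\gamma) = \int_{t_0}^{t_1} \left[ L(x + h, \dot{x} + \dot{h}, t) - L(x, \dot{x}, t) \right] dt = \int_{t_0}^{t_1} \left[ \frac{\partial L}{\partial x} h + \frac{\partial L}{\partial \dot{x}} \dot{h} \right] dt + O(h^2) = F(h) + R,$$
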

26 We should specify the class of curves on which ® is defined and the linear space which con- 
tains h. One could assume, for example, that both spaces consist of the infinitely differentiable 
functions. 


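where

$$F(h) = \int_{t_0}^{t_1} \left( \frac{\partial L}{\partial x} h + \frac{\partial L}{\partial \dot{x}} \dot{h} \right) dt \quad \text{and} \quad R = O(h^2).$$

Integrating by parts, we find that

$$\int_{t_0}^{t_1} \frac{\partial L}{\partial \dot{x}} \dot{h}\, dt = - \int_{t_0}^{t_1} h\, \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) dt + \left( h\, \frac{\partial L}{\partial \dot{x}} \right) \Big|_{t_0}^{t_1}. \qquad \square$$
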

B Extremals


Definition. An extremal of a differentiable functional $\Phi(\gamma)$ is a curve $\gamma$ such that
$F(h) = 0$ for all $h$.
(In exactly the same way that $\gamma$ is a stationary point of a function if the
differential is equal to zero at that point.)


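Theorem. The curve $\gamma: x = x(t)$ is an extremal of the functional $\Phi(\gamma) =
\int_{t_0}^{t_1} L(x, \dot{x}, t)\, dt$ on the space of curves passing through the points $x(t_0) = x_0$
and $x(t_1) = x_1$, if and only if

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = 0 \quad \text{along the curve } x(t).$$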


Lemma. If a continuous function $f(t)$, $t_0 \le t \le t_1$, satisfies $\int_{t_0}^{t_1} f(t) h(t)\, dt = 0$
for any continuous²⁷ function $h(t)$ with $h(t_0) = h(t_1) = 0$, then $f(t) \equiv 0$.


Figure 42 Construction of the function h


PROOF OF THE LEMMA. Let $f(t^*) > 0$ for some $t^*$, $t_0 < t^* < t_1$. Since $f$ is
continuous, $f(t) > c > 0$ in some neighborhood $\Delta$ of the point $t^*$:
$t_0 < t^* - d < t < t^* + d < t_1$. Let $h(t)$ be such that $h(t) = 0$ outside $\Delta$, $h(t) > 0$ in $\Delta$,
and $h(t) = 1$ in $\Delta/2$ (i.e., for $t$ such that $t^* - \tfrac{1}{2}d < t < t^* + \tfrac{1}{2}d$). Then, clearly,
$\int_{t_0}^{t_1} f(t) h(t)\, dt \ge dc > 0$ (Figure 42). This contradiction shows that $f(t^*) = 0$
for all $t^*$, $t_0 < t^* < t_1$. □


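PROOF OF THE THEOREM. By the preceding theorem,

$$F(h) = - \int_{t_0}^{t_1} \left[ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} \right] h\, dt + \left( \frac{\partial L}{\partial \dot{x}}\, h \right) \Big|_{t_0}^{t_1}.$$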


27 Or even for any infinitely differentiable function h. 




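The term after the integral is equal to zero since $h(t_0) = h(t_1) = 0$. If $\gamma$ is an
extremal, then $F(h) = 0$ for all $h$ with $h(t_0) = h(t_1) = 0$. Therefore,

$$\int_{t_0}^{t_1} f(t) h(t)\, dt = 0, \quad \text{where} \quad f(t) = \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x},$$

for all such $h$. By the lemma, $f(t) \equiv 0$. Conversely, if $f(t) \equiv 0$, then clearly
$F(h) \equiv 0$. □
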
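EXAMPLE. We verify that the extremals of length are straight lines. We have:

$$L = \sqrt{1 + \dot{x}^2}, \qquad \frac{\partial L}{\partial x} = 0, \qquad \frac{\partial L}{\partial \dot{x}} = \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}, \qquad \frac{d}{dt} \left( \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}} \right) = 0,$$

$$\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}} = c, \qquad \dot{x} = c_1, \qquad x = c_1 t + c_2.$$
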
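
The claim that straight lines are extremals of length can be probed numerically. The sketch below, under the assumption that a polyline with 2000 segments approximates the length well enough, estimates the first variation by a central difference in a small parameter `eps`. The names `length` and `first_variation`, the test perturbation, and the two sample curves are hypothetical choices for this illustration.

```python
import math

def length(x, n=2000):
    """Polyline approximation of the length of the graph of x on [0, 1]."""
    dt = 1.0 / n
    return sum(math.hypot(dt, x((i + 1) * dt) - x(i * dt)) for i in range(n))

def first_variation(x, h, eps=1e-4):
    """Central-difference estimate of the linear part F(h) of the increment."""
    plus = length(lambda t: x(t) + eps * h(t))
    minus = length(lambda t: x(t) - eps * h(t))
    return (plus - minus) / (2.0 * eps)

# A variation that vanishes at both end points, as required for extremals.
h = lambda t: math.sin(math.pi * t)

var_line = first_variation(lambda t: 2.0 * t + 1.0, h)   # straight line
var_parabola = first_variation(lambda t: t * t, h)       # not a straight line
print(var_line, var_parabola)
```

For the straight line the estimate should be close to zero, while for the parabola it should be clearly nonzero, in agreement with the Euler-Lagrange condition.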
C The Euler-Lagrange equation 


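Definition. The equation

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} = 0$$

is called the Euler-Lagrange equation for the functional

$$\Phi = \int_{t_0}^{t_1} L(x, \dot{x}, t)\, dt.$$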


Now let $x$ be a vector in the n-dimensional coordinate space $\mathbb{R}^n$,
$\gamma = \{(t, x): x = x(t),\ t_0 \le t \le t_1\}$ a curve in the (n + 1)-dimensional space
$\mathbb{R} \times \mathbb{R}^n$, and $L: \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ a function of 2n + 1 variables. As before,
we show:


Theorem. The curve $\gamma$ is an extremal of the functional $\Phi(\gamma) = \int_{t_0}^{t_1} L(x, \dot{x}, t)\, dt$
on the space of curves joining $(t_0, x_0)$ and $(t_1, x_1)$, if and only if the Euler-Lagrange
equation is satisfied along $\gamma$.


This is a system of n second-order equations, and the solution depends on
2n arbitrary constants. The 2n conditions $x(t_0) = x_0$, $x(t_1) = x_1$ are used
for finding them.


PROBLEM. Cite examples where there are many extremals connecting two 
given points, and others where there are none at all. 




D An important remark 
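The condition for a curve $\gamma$ to be an extremal of a functional does not depend
on the choice of coordinate system.

For example, the same functional (length of a curve) is given in cartesian
and polar coordinates by the different formulas

$$\Phi_{\text{cart}} = \int_{t_0}^{t_1} \sqrt{\dot{x}_1^2 + \dot{x}_2^2}\, dt, \qquad \Phi_{\text{pol}} = \int_{t_0}^{t_1} \sqrt{\dot{r}^2 + r^2 \dot{\varphi}^2}\, dt.$$

The extremals are the same: straight lines in the plane. The equations of
lines in cartesian and polar coordinates are given by different functions:
$x_1 = x_1(t)$, $x_2 = x_2(t)$, and $r = r(t)$, $\varphi = \varphi(t)$.

However, both these vector functions satisfy the Euler-Lagrange
equation

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0,$$

only, in the first case, when $x_{\text{cart}} = x_1, x_2$ and $L_{\text{cart}} = \sqrt{\dot{x}_1^2 + \dot{x}_2^2}$, and in
the second case when $x_{\text{pol}} = r, \varphi$ and $L_{\text{pol}} = \sqrt{\dot{r}^2 + r^2 \dot{\varphi}^2}$.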

In this way we can easily describe in any coordinates a differential equa- 
tion for the family of all straight lines. 


PROBLEM. Find the differential equation for the family of all straight lines 
in the plane in polar coordinates. 


13 Lagrange’s equations 
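Here we indicate the variational principle whose extremals are solutions of Newton’s equations
of motion in a potential system.

We compare Newton’s equations of dynamics

$$\frac{d}{dt} (m_i \dot{\mathbf{r}}_i) + \frac{\partial U}{\partial \mathbf{r}_i} = 0 \qquad (1)$$

with the Euler-Lagrange equation

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0.$$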


A Hamilton’s principle of least action 


Theorem. Motions of the mechanical system (1) coincide with extremals of
the functional

$$\Phi(\gamma) = \int_{t_0}^{t_1} L\, dt, \quad \text{where } L = T - U$$

is the difference between the kinetic and potential energy.


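PROOF. Since $U = U(\mathbf{r})$ and $T = \sum m_i \dot{\mathbf{r}}_i^2 / 2$, we have
$\partial L/\partial \dot{\mathbf{r}}_i = \partial T/\partial \dot{\mathbf{r}}_i = m_i \dot{\mathbf{r}}_i$
and $\partial L/\partial \mathbf{r}_i = -\partial U/\partial \mathbf{r}_i$. □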


Corollary. Let $(q_1, \ldots, q_{3n})$ be any coordinates in the configuration space of
a system of n mass points. Then the evolution of $q$ with time is subject to the
Euler-Lagrange equations

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}} \right) - \frac{\partial L}{\partial q} = 0, \quad \text{where } L = T - U.$$

PROOF. By the theorem above, a motion is an extremal of the functional
$\int L\, dt$. Therefore, in any system of coordinates the Euler-Lagrange equation
written in that coordinate system is satisfied. □


Definition. In mechanics we use the following terminology: $L(q, \dot{q}, t) = T - U$
is the Lagrange function or lagrangian, $q_i$ are the generalized coordinates,
$\dot{q}_i$ are generalized velocities, $\partial L/\partial \dot{q}_i = p_i$ are generalized momenta,
$\partial L/\partial q_i$ are generalized forces, $\int_{t_0}^{t_1} L(q, \dot{q}, t)\, dt$ is the action,
$(d(\partial L/\partial \dot{q}_i)/dt) - (\partial L/\partial q_i) = 0$ are Lagrange’s equations.


The last theorem is called “Hamilton’s form of the principle of least
action” because in many cases the motion $q(t)$ is not only an extremal but
is also a minimum value of the action functional $\int_{t_0}^{t_1} L\, dt$.


B The simplest examples 


EXAMPLE 1. For a free mass point in $E^3$,

$$L = T = \frac{m \dot{\mathbf{r}}^2}{2};$$

in cartesian coordinates $q_i = r_i$ we find

$$L = \frac{m}{2} (\dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2).$$

Here the generalized velocities are the components of the velocity vector,
the generalized momenta $p_i = m \dot{q}_i$ are the components of the momentum
vector, and Lagrange’s equations coincide with Newton’s equations
$d\mathbf{p}/dt = 0$. The extremals are straight lines. It follows from Hamilton’s
principle that straight lines are not only shortest (i.e., extremals of the length
$\int_{t_0}^{t_1} \sqrt{\dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2}\, dt$) but also extremals of the action
$\int_{t_0}^{t_1} (\dot{q}_1^2 + \dot{q}_2^2 + \dot{q}_3^2)\, dt$.


PROBLEM. Show that this extremum is a minimum. 


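EXAMPLE 2. We consider planar motion in a central field in polar coordinates
$q_1 = r$, $q_2 = \varphi$. From the relation $\dot{\mathbf{r}} = \dot{r} \mathbf{e}_r + \dot{\varphi} r \mathbf{e}_\varphi$ we find the kinetic energy
$T = \tfrac{1}{2} m \dot{\mathbf{r}}^2 = \tfrac{1}{2} m (\dot{r}^2 + r^2 \dot{\varphi}^2)$ and the lagrangian $L(q, \dot{q}) = T(q, \dot{q}) - U(q)$,
where $U = U(q_1)$.

The generalized momenta will be $p = \partial L/\partial \dot{q}$, i.e.,

$$p_1 = m \dot{r}, \qquad p_2 = m r^2 \dot{\varphi}.$$

The first Lagrange equation $\dot{p}_1 = \partial L/\partial q_1$ takes the form

$$m \ddot{r} = m r \dot{\varphi}^2 - \frac{\partial U}{\partial r}.$$

We already obtained this equation in Section 8.

Since $q_2 = \varphi$ does not enter into $L$, we have $\partial L/\partial q_2 = 0$. Therefore, the
second Lagrange equation will be $\dot{p}_2 = 0$, $p_2 = \text{const}$. This is the law of
conservation of angular momentum.

In general, when the field is not central ($U = U(r, \varphi)$), we find
$\dot{p}_2 = -\partial U/\partial \varphi$.

This equation can be rewritten in the form $d(\mathbf{M}, \mathbf{e}_z)/dt = N$, where
$N = ([\mathbf{r}, \mathbf{F}], \mathbf{e}_z)$ and $\mathbf{F} = -\partial U/\partial \mathbf{r}$. (The rate of change in angular momentum
relative to the z axis is equal to the moment of the force $\mathbf{F}$ relative to the
z axis.)

In fact, we have $dU = (\partial U/\partial r)\, dr + (\partial U/\partial \varphi)\, d\varphi = -(\mathbf{F}, d\mathbf{r}) = -(\mathbf{F}, \mathbf{e}_r)\, dr - r (\mathbf{F}, \mathbf{e}_\varphi)\, d\varphi$;
therefore, $-\partial U/\partial \varphi = r (\mathbf{F}, \mathbf{e}_\varphi) = r ([\mathbf{e}_r, \mathbf{F}], \mathbf{e}_z) = ([\mathbf{r}, \mathbf{F}], \mathbf{e}_z)$.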

This example suggests the following generalization of the law of con- 
servation of angular momentum. 


Definition. A coordinate $q_i$ is called cyclic if it does not enter into the
lagrangian: $\partial L/\partial q_i = 0$.


Theorem. The generalized momentum corresponding to a cyclic coordinate is
conserved: $p_i = \text{const}$.


PROOF. By Lagrange’s equation $dp_i/dt = \partial L/\partial q_i = 0$. □


14 Legendre transformations 


The Legendre transformation is a very useful mathematical tool: it transforms functions on a 
vector space to functions on the dual space. Legendre transformations are related to projective 
duality and tangential coordinates in algebraic geometry and the construction of dual Banach 
spaces in analysis. They are often encountered in physics (for example, in the definition of 
thermodynamic quantities). 


A Definition 


Let $y = f(x)$ be a convex function, $f''(x) > 0$.

The Legendre transformation of the function $f$ is a new function $g$ of a
new variable $p$, which is constructed in the following way (Figure 43). We
draw the graph of $f$ in the x, y plane. Let $p$ be a given number. Consider the
straight line $y = px$. We take the point $x = x(p)$ at which the curve is farthest
from the straight line in the vertical direction: for each $p$ the function
$px - f(x) = F(p, x)$ has a maximum with respect to $x$ at the point $x(p)$. Now we
define $g(p) = F(p, x(p))$.

Figure 43 Legendre transformation

The point $x(p)$ is defined by the extremal condition $\partial F/\partial x = 0$, i.e.,
$f'(x) = p$. Since $f$ is convex, the point $x(p)$ is unique.²⁸


PROBLEM. Show that the domain of g can be a point, a closed interval, or a ray if f is defined
on the whole x axis. Prove that if f is defined on a closed interval, then g is defined on the whole p
axis.


B Examples 
EXAMPLE 1. Let $f(x) = x^2$. Then $F(p, x) = px - x^2$, $x(p) = \tfrac{1}{2} p$, $g(p) = \tfrac{1}{4} p^2$.

EXAMPLE 2. Let $f(x) = m x^2/2$. Then $g(p) = p^2/2m$.

EXAMPLE 3. Let $f(x) = x^\alpha/\alpha$. Then $g(p) = p^\beta/\beta$, where $(1/\alpha) + (1/\beta) = 1$
($\alpha > 1$, $\beta > 1$).
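
The three examples above can be checked by brute force. The following is a minimal sketch that assumes the maximizing point lies inside a finite search interval and approximates the maximum of $px - f(x)$ on a uniform grid. The helper name `legendre`, the grid bounds, and the sample values of $p$, $m$, and $\alpha$ are illustrative assumptions.

```python
def legendre(f, p, lo=-10.0, hi=10.0, n=200001):
    """Approximate g(p) = max over x of (p*x - f(x)) on a uniform grid in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(p * (lo + i * step) - f(lo + i * step) for i in range(n))

m = 3.0
g1 = legendre(lambda x: x * x, 3.0)               # compare with p**2 / 4
g2 = legendre(lambda x: m * x * x / 2.0, 2.0)     # compare with p**2 / (2*m)
# Example 3 with alpha = 3 (so beta = 3/2), restricted to x >= 0 where f is convex.
g3 = legendre(lambda x: x ** 3 / 3.0, 4.0, lo=0.0)
print(g1, g2, g3)
```

A grid search is crude but makes the definition concrete: for each slope $p$ it looks for the largest vertical gap between the line $y = px$ and the graph of $f$.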


Figure 44 Legendre transformation taking an angle to a line segment


EXAMPLE 4. Let f(x) be a convex polygon. Then g(p) is also a convex polygon, 
in which the vertices of f(x) correspond to the edges of g(p), and the edges of 
f(x) to the vertices of g(p). For example, the corner depicted in Figure 44 is 
transformed to a segment under the Legendre transformation. 


28 If it exists.

C Involutivity


Let us consider a function f which is differentiable as many times as necessary, 
with f"(x) > 0. It is easy to verify that a Legendre transformation takes 
convex functions to convex functions. Therefore, we can apply it twice. 


Theorem. The Legendre transformation is involutive, i.e., its square is the 
identity: if under the Legendre transformation f is taken to g, then the 
Legendre transform of g will again be f. 


PROOF. In order to apply the Legendre transform to $g$, with variable $p$, we
must by definition look at a new independent variable (which we will call $x$),
construct the function

$$G(x, p) = xp - g(p),$$

and find the point $p(x)$ at which $G$ attains its maximum: $\partial G/\partial p = 0$, i.e.,
$g'(p) = x$. Then the Legendre transform of $g(p)$ will be the function of $x$
equal to $G(x, p(x))$.

We will show that $G(x, p(x)) = f(x)$. To this end we notice that $G(x, p) =
xp - g(p)$ has a simple geometric interpretation: it is the ordinate of the
point with abscissa $x$ on the line tangent to the graph of $f(x)$ with slope $p$
(Figure 45). For fixed $p$, the function $G(x, p)$ is a linear function of $x$, with
$\partial G/\partial x = p$, and for $x = x(p)$ we have $G(x, p) = xp - g(p) = f(x)$ by the
definition of $g(p)$.

Figure 45 Involutivity of the Legendre transformation

Let us now fix $x = x_0$ and vary $p$. Then the values of $G(x, p)$ will be the
ordinates of the points of intersection of the line $x = x_0$ with the lines tangent
to the graph of $f(x)$ with various slopes $p$. By the convexity of the graph it
follows that all these tangents lie below the curve, and therefore the maximum
of $G(x, p)$ for fixed $x = x_0$ is equal to $f(x_0)$ (and is achieved for $p = p(x_0) =
f'(x_0)$). □
Figure 46 Legendre transformation of a quadratic form


Corollary.²⁹ Consider a given family of straight lines $y = px - g(p)$. Then
its envelope has the equation $y = f(x)$, where $f$ is the Legendre transform
of $g$.

D Young’s inequality 


Definition. Two functions, f and g, which are the Legendre transforms of 
one another are called dual in the sense of Young. 


By definition of the Legendre transform, $F(x, p) = px - f(x)$ is less
than or equal to $g(p)$ for any $x$ and $p$. From this we have Young’s inequality:

$$px \le f(x) + g(p).$$


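EXAMPLE 1. If $f(x) = \tfrac{1}{2} x^2$, then $g(p) = \tfrac{1}{2} p^2$ and we obtain the well-known
inequality $px \le \tfrac{1}{2} x^2 + \tfrac{1}{2} p^2$ for all $x$ and $p$.


EXAMPLE 2. If $f(x) = x^\alpha/\alpha$, then $g(p) = p^\beta/\beta$, where $(1/\alpha) + (1/\beta) = 1$, and
we obtain Young’s inequality $px \le (x^\alpha/\alpha) + (p^\beta/\beta)$ for all $x > 0$, $p > 0$,
$\alpha > 1$, $\beta > 1$, and $(1/\alpha) + (1/\beta) = 1$.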
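
A quick randomized check of Young’s inequality for the power functions is sketched below. The function name `young_slack`, the sampling ranges, and the fixed random seed are assumptions made for this illustration; a numerical check of this kind supports the inequality but does not prove it.

```python
import random

def young_slack(x, p, alpha):
    """Return x**alpha/alpha + p**beta/beta - p*x, where 1/alpha + 1/beta = 1."""
    beta = alpha / (alpha - 1.0)
    return x ** alpha / alpha + p ** beta / beta - p * x

random.seed(0)
slacks = [
    young_slack(random.uniform(0.01, 5.0), random.uniform(0.01, 5.0), random.uniform(1.1, 4.0))
    for _ in range(1000)
]
min_slack = min(slacks)

# Equality is expected when p = f'(x); for alpha = 2 this means p = x.
equality_case = young_slack(2.0, 2.0, 2.0)
print(min_slack, equality_case)
```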


E The case of many variables 


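Now let $f(\mathbf{x})$ be a convex function of the vector variable $\mathbf{x} = (x_1, \ldots, x_n)$
(i.e., the quadratic form $((\partial^2 f/\partial \mathbf{x}^2)\, d\mathbf{x}, d\mathbf{x})$ is positive definite). Then the
Legendre transform is the function $g(\mathbf{p})$ of the vector variable $\mathbf{p} = (p_1, \ldots, p_n)$,
defined as above by the equalities $g(\mathbf{p}) = F(\mathbf{p}, \mathbf{x}(\mathbf{p})) = \max_{\mathbf{x}} F(\mathbf{p}, \mathbf{x})$, where
$F(\mathbf{p}, \mathbf{x}) = (\mathbf{p}, \mathbf{x}) - f(\mathbf{x})$ and $\mathbf{p} = \partial f/\partial \mathbf{x}$.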

All of the above arguments, including Young’s inequality, can be carried 
over without change to this case. 


PROBLEM. Let $f: \mathbb{R}^n \to \mathbb{R}$ be a convex function. Let $\mathbb{R}^{n*}$ denote the dual vector
space. Show that the formulas above completely define the mapping
$g: \mathbb{R}^{n*} \to \mathbb{R}$ (under the condition that the linear form $df|_x$ ranges over all of
$\mathbb{R}^{n*}$ when $x$ ranges over $\mathbb{R}^n$).


29 One can easily see that this is the theory of “Clairaut’s equation.” 
PROBLEM. Let $f$ be the quadratic form $f(\mathbf{x}) = \sum f_{ij} x_i x_j$. Show that its
Legendre transform is again a quadratic form $g(\mathbf{p}) = \sum g_{ij} p_i p_j$, and that the
values of both forms at corresponding points coincide (Figure 46):

$$f(\mathbf{x}(\mathbf{p})) = g(\mathbf{p}) \quad \text{and} \quad g(\mathbf{p}(\mathbf{x})) = f(\mathbf{x}).$$


15 Hamilton’s equations 


By means of a Legendre transformation, a lagrangian system of second-order differential 
equations is converted into a remarkably symmetrical system of 2n first-order equations called 
a hamiltonian system of equations (or canonical equations). 


A Equivalence of Lagrange’s and Hamilton’s 
equations 


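We consider the system of Lagrange’s equations $\dot{p} = \partial L/\partial q$, where
$p = \partial L/\partial \dot{q}$, with a given lagrangian function $L: \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$, which we will
assume to be convex³⁰ with respect to the second argument $\dot{q}$.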


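Theorem. The system of Lagrange’s equations is equivalent to the system of
2n first-order equations (Hamilton’s equations)

$$\dot{p} = -\frac{\partial H}{\partial q}, \qquad \dot{q} = \frac{\partial H}{\partial p},$$

where $H(p, q, t) = p \dot{q} - L(q, \dot{q}, t)$ is the Legendre transform of the lagrangian
function viewed as a function of $\dot{q}$.

PROOF. By definition, the Legendre transform of $L(q, \dot{q}, t)$ with respect to $\dot{q}$
is the function $H(p) = p \dot{q} - L(\dot{q})$, in which $\dot{q}$ is expressed in terms of $p$
by the formula $p = \partial L/\partial \dot{q}$, and which depends on the parameters $q$ and $t$.
This function $H$ is called the hamiltonian.

The total differential of the hamiltonian

$$dH = \frac{\partial H}{\partial p}\, dp + \frac{\partial H}{\partial q}\, dq + \frac{\partial H}{\partial t}\, dt$$

is equal to the total differential of $p \dot{q} - L$ for $p = \partial L/\partial \dot{q}$:

$$dH = \dot{q}\, dp - \frac{\partial L}{\partial q}\, dq - \frac{\partial L}{\partial t}\, dt.$$

Both expressions for $dH$ must be the same. Therefore,

$$\dot{q} = \frac{\partial H}{\partial p}, \qquad \frac{\partial H}{\partial q} = -\frac{\partial L}{\partial q}, \qquad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}.$$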


30 In practice this convex function will often be a positive definite quadratic form.

Applying Lagrange’s equations $\dot{p} = \partial L/\partial q$, we obtain Hamilton’s equations.

We have seen that, if $q(t)$ satisfies Lagrange’s equations, then $(p(t), q(t))$
satisfies Hamilton’s equations. The converse is proved in an analogous
manner. Therefore, the systems of Lagrange and Hamilton are equivalent. □


Remark. The theorem just proved applies to all variational problems, not 
just to the lagrangian equations of mechanics. 


B Hamilton’s function and energy 


EXAMPLE. Suppose now that the equations are mechanical, so that the
lagrangian has the usual form $L = T - U$, where the kinetic energy $T$ is a
quadratic form with respect to $\dot{q}$:

$$T = \tfrac{1}{2} \sum a_{ij} \dot{q}_i \dot{q}_j, \quad \text{where } a_{ij} = a_{ij}(q, t) \text{ and } U = U(q).$$


Theorem. Under the given assumptions, the hamiltonian H is the total energy 
H=T+U. 


The proof is based on the following lemma on the Legendre transform of 
a quadratic form. 


Lemma. The values of a quadratic form $f(x)$ and of its Legendre transform
$g(p)$ coincide at corresponding points: $f(x) = g(p)$.


EXAMPLE. For the form $f(x) = x^2$ this is a well-known property of a tangent
to a parabola. For the form $f(x) = \tfrac{1}{2} m x^2$ we have $p = mx$ and $g(p) =
p^2/2m = m x^2/2 = f(x)$.


PROOF OF THE LEMMA. By Euler’s theorem on homogeneous functions
$(\partial f/\partial x) x = 2f$. Therefore, $g(p(x)) = px - f(x) = (\partial f/\partial x) x - f = 2f(x) -
f(x) = f(x)$. □


PROOF OF THE THEOREM. Reasoning as in the lemma, we find that $H = p \dot{q} -
L = 2T - (T - U) = T + U$. □


EXAMPLE. For one-dimensional motion

$$\ddot{q} = -\frac{\partial U}{\partial q}.$$

In this case $T = \tfrac{1}{2} \dot{q}^2$, $U = U(q)$, $p = \dot{q}$, $H = \tfrac{1}{2} p^2 + U(q)$ and Hamilton’s
equations take the form

$$\dot{q} = p, \qquad \dot{p} = -\frac{\partial U}{\partial q}.$$

This example makes it easy to remember which of Hamilton’s equations
has a minus sign.
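
To see Hamilton’s equations for one-dimensional motion at work, the sketch below integrates $\dot{q} = p$, $\dot{p} = -\partial U/\partial q$ for the assumed potential $U(q) = q^2/2$ (a harmonic oscillator, chosen only because its exact solution $q = \cos t$ is known). The leapfrog scheme, the step size, and the names `leapfrog` and `hamiltonian` are choices made for this example, not part of the text.

```python
import math

def leapfrog(q, p, dU, dt, steps):
    """Integrate dq/dt = p, dp/dt = -dU/dq with the leapfrog (kick-drift-kick) scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dU(q)   # half step for the momentum: note the minus sign
        q += dt * p             # full step for the coordinate
        p -= 0.5 * dt * dU(q)   # second half step for the momentum
    return q, p

def hamiltonian(q, p):
    """H = p^2/2 + U(q) for the assumed potential U(q) = q^2/2."""
    return 0.5 * p * p + 0.5 * q * q

q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, lambda q: q, dt=0.01, steps=1000)   # integrate up to t = 10
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
print(q1, math.cos(10.0), energy_drift)
```

The computed value of $H$ stays nearly constant along the trajectory, which is what one expects when the hamiltonian does not depend explicitly on time.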

Several important corollaries follow from the theorem on the equivalence 
of the equations of motion to a hamiltonian system. For example, the law of 
conservation of energy takes the simple form: 


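Corollary 1. $dH/dt = \partial H/\partial t$. In particular, for a system whose hamiltonian
function does not depend explicitly on time ($\partial H/\partial t = 0$), the law of conservation
of the hamiltonian function holds: $H(p(t), q(t)) = \text{const}$.


PROOF. We consider the variation in $H$ along the trajectory $H(p(t), q(t), t)$.
Then, by Hamilton’s equations,

$$\frac{dH}{dt} = \frac{\partial H}{\partial p} \left( -\frac{\partial H}{\partial q} \right) + \frac{\partial H}{\partial q} \frac{\partial H}{\partial p} + \frac{\partial H}{\partial t} = \frac{\partial H}{\partial t}. \qquad \square$$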


C Cyclic coordinates 


When considering central fields, we noticed that a problem could be reduced 
to a one-dimensional problem by the introduction of polar coordinates. It 
turns out that, given any symmetry of a problem allowing us to choose a 
system of coordinates q in such a way that the hamiltonian function is 
independent of some of the coordinates, we can find some first integrals and 
thereby reduce to a problem in a smaller number of coordinates. 


Definition. If a coordinate $q_1$ does not enter into the hamiltonian function
$H(p_1, p_2, \ldots, p_n; q_1, \ldots, q_n; t)$, i.e., $\partial H/\partial q_1 = 0$, then it is called cyclic
(the term comes from the particular case of the angular coordinate in a
central field).


Clearly, the coordinate $q_1$ is cyclic if and only if it does not enter into the
lagrangian function ($\partial L/\partial q_1 = 0$). It follows from the hamiltonian form of
the equations of motion that:


Corollary 2. Let $q_1$ be a cyclic coordinate. Then $p_1$ is a first integral. In this
case the variation of the remaining coordinates with time is the same as in a
system with the n − 1 independent coordinates $q_2, \ldots, q_n$ and with hamiltonian
function

$$H(p_2, \ldots, p_n, q_2, \ldots, q_n, t, c),$$

depending on the parameter $c = p_1$.


PROOF. We set $p' = (p_2, \ldots, p_n)$ and $q' = (q_2, \ldots, q_n)$. Then Hamilton’s
equations take the form

$$\frac{d}{dt} q' = \frac{\partial H}{\partial p'}, \qquad \frac{d}{dt} q_1 = \frac{\partial H}{\partial p_1},$$

$$\frac{d}{dt} p' = -\frac{\partial H}{\partial q'}, \qquad \frac{d}{dt} p_1 = 0.$$

The last equation shows that $p_1 = \text{const}$. Therefore, in the system of equations
for $p'$ and $q'$, the value of $p_1$ enters only as a parameter in the hamiltonian
function. After this system of 2n − 2 equations is solved, the equation for $q_1$
takes the form

$$\frac{d}{dt} q_1 = f(t), \quad \text{where } f(t) = \frac{\partial}{\partial p_1} H(p_1, p'(t), q'(t), t),$$

and is easily integrated. □


Almost all the solved problems in mechanics have been solved by means 
of Corollary 2. 


Corollary 3. Every closed system with two degrees of freedom (n = 2) which has 
a cyclic coordinate is integrable. 


PROOF. In this case the system for $p'$ and $q'$ is one-dimensional and is
immediately integrated by means of the integral $H(p', q') = c$. □


16 Liouville’s theorem 


The phase flow of Hamilton’s equations preserves phase volume. It follows, for example, that a 
hamiltonian system cannot be asymptotically stable. 


For simplicity we look at the case in which the hamiltonian function does 
not depend explicitly on the time: H = H(p, q). 


A The phase flow 
Definition. The 2n-dimensional space with coordinates $p_1, \ldots, p_n; q_1, \ldots, q_n$
is called phase space.


EXAMPLE. In the case n = 1 this is the phase plane of the system $\ddot{x} = -\partial U/\partial x$,
which we considered in Section 4.


Just as in this simplest example, the right-hand sides of Hamilton’s
equations give a vector field: at each point $(p, q)$ of phase space there is a
2n-dimensional vector $(-\partial H/\partial q, \partial H/\partial p)$. We assume that every solution of
Hamilton’s equations can be extended to the whole time axis.³¹


Definition. The phase flow is the one-parameter group of transformations
of phase space

$$g^t: (p(0), q(0)) \mapsto (p(t), q(t)),$$

where $p(t)$ and $q(t)$ are solutions of Hamilton’s system of equations
(Figure 47).


PROBLEM. Show that $\{g^t\}$ is a group.


31 For this it is sufficient, for example, that the level sets of H be compact. 


Figure 47 Phase flow 


B Liouville’s theorem 


Theorem 1. The phase flow preserves volume: for any region $D$ we have (Figure 48)

$$\text{volume of } g^t D = \text{volume of } D.$$

We will prove the following slightly more general proposition also
due to Liouville.

Figure 48 Conservation of volume


Suppose we are given a system of ordinary differential equations
$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, $\mathbf{x} = (x_1, \ldots, x_n)$, whose solution may be extended to the whole
time axis. Let $\{g^t\}$ be the corresponding group of transformations:

$$g^t(\mathbf{x}) = \mathbf{x} + \mathbf{f}(\mathbf{x}) t + O(t^2), \quad (t \to 0). \qquad (1)$$

Let $D(0)$ be a region in x-space and $v(0)$ its volume;

$$v(t) = \text{volume of } D(t), \qquad D(t) = g^t D(0).$$

Theorem 2. If $\operatorname{div} \mathbf{f} \equiv 0$, then $g^t$ preserves volume: $v(t) = v(0)$.


C Proof 
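Lemma 1. $(dv/dt)|_{t=0} = \int_{D(0)} \operatorname{div} \mathbf{f}\, dx$ ($dx = dx_1 \cdots dx_n$).

PROOF. For any $t$, the formula for changing variables in a multiple integral
gives

$$v(t) = \int_{D(0)} \det \frac{\partial g^t \mathbf{x}}{\partial \mathbf{x}}\, dx.$$

Calculating $\partial g^t \mathbf{x}/\partial \mathbf{x}$ by formula (1), we find

$$\frac{\partial g^t \mathbf{x}}{\partial \mathbf{x}} = E + \frac{\partial \mathbf{f}}{\partial \mathbf{x}} t + O(t^2) \quad \text{as } t \to 0.$$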
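We will now use a well-known algebraic fact:


Lemma 2. For any matrix $A = (a_{ij})$,

$$\det(E + At) = 1 + t \operatorname{tr} A + O(t^2), \quad t \to 0,$$

where $\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}$ is the trace of $A$ (the sum of the diagonal elements).


(The proof of Lemma 2 is obtained by a direct expansion of the determinant:
we get 1 and n terms in $t$; the remaining terms involve $t^2$, $t^3$, etc.)
Using this, we have

$$\det \frac{\partial g^t \mathbf{x}}{\partial \mathbf{x}} = 1 + t \operatorname{tr} \frac{\partial \mathbf{f}}{\partial \mathbf{x}} + O(t^2).$$

But $\operatorname{tr} \partial \mathbf{f}/\partial \mathbf{x} = \sum_{i=1}^{n} \partial f_i/\partial x_i = \operatorname{div} \mathbf{f}$. Therefore,

$$v(t) = \int_{D(0)} [1 + t \operatorname{div} \mathbf{f} + O(t^2)]\, dx,$$

which proves Lemma 1. □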
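
The expansion of the determinant used in this proof can be checked numerically for a small matrix. In the sketch below the 3 × 3 matrix `A`, the helper names `det3` and `remainder`, and the two values of $t$ are arbitrary choices for illustration. If the remainder is of order $t^2$, halving $t$ should divide it by roughly four.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def remainder(A, t):
    """Return det(E + A*t) - (1 + t*tr A), which should be of order t**2."""
    E_plus_At = [[(1.0 if r == s else 0.0) + t * A[r][s] for s in range(3)]
                 for r in range(3)]
    trace = sum(A[k][k] for k in range(3))
    return det3(E_plus_At) - (1.0 + t * trace)

A = [[1.0, 2.0, 0.0],
     [3.0, -1.0, 4.0],
     [0.0, 5.0, 2.0]]

r1 = remainder(A, 1e-3)
r2 = remainder(A, 5e-4)
print(r1, r2, r1 / r2)
```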


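PROOF OF THEOREM 2. Since $t = t_0$ is no worse than $t = 0$, Lemma 1 can be
written in the form

$$\frac{dv(t)}{dt} \Big|_{t = t_0} = \int_{D(t_0)} \operatorname{div} \mathbf{f}\, dx,$$

and if $\operatorname{div} \mathbf{f} \equiv 0$, $dv/dt \equiv 0$. □


In particular, for Hamilton’s equations we have

$$\operatorname{div} \mathbf{f} = \frac{\partial}{\partial p} \left( -\frac{\partial H}{\partial q} \right) + \frac{\partial}{\partial q} \left( \frac{\partial H}{\partial p} \right) \equiv 0.$$

This proves Liouville’s theorem (Theorem 1). □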
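
Volume preservation can be observed numerically through the Jacobian determinant of the flow map, which measures the local change of phase volume. The sketch below is an illustration under several assumptions: the pendulum hamiltonian $H = p^2/2 - \cos q$, an added friction term for contrast, a Runge-Kutta integrator, and a finite-difference Jacobian. The names `rk4_flow` and `jacobian_det` are hypothetical, and the state is ordered as $(q, p)$.

```python
import math

def rk4_flow(f, x, t, steps=1000):
    """Approximate the time-t flow of dx/dt = f(x) by the classical Runge-Kutta method."""
    h = t / steps
    x = list(x)
    for _ in range(steps):
        k1 = f(x)
        k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
        k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
        k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6.0
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

def jacobian_det(f, x, t, eps=1e-5):
    """Determinant of the derivative of the flow map at x (two-dimensional systems only)."""
    cols = []
    for j in range(2):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = rk4_flow(f, xp, t), rk4_flow(f, xm, t)
        cols.append([(a - b) / (2.0 * eps) for a, b in zip(fp, fm)])
    return cols[0][0] * cols[1][1] - cols[0][1] * cols[1][0]

pendulum = lambda x: [x[1], -math.sin(x[0])]               # hamiltonian field, div f = 0
damped = lambda x: [x[1], -math.sin(x[0]) - 0.5 * x[1]]    # friction added, div f = -0.5

det_ham = jacobian_det(pendulum, [1.0, 0.5], 2.0)
det_damped = jacobian_det(damped, [1.0, 0.5], 2.0)
print(det_ham, det_damped, math.exp(-1.0))
```

For the hamiltonian field the determinant stays close to 1, while for the damped field it is close to $e^{-0.5 \cdot 2}$, the value one expects when the divergence is the constant $-0.5$.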


PROBLEM. Prove Liouville’s formula $W = W_0 e^{\int \operatorname{tr} A\, dt}$ for the Wronskian
determinant of the linear system $\dot{\mathbf{x}} = A(t) \mathbf{x}$.


Liouville’s theorem has many applications.

PROBLEM. Show that in a hamiltonian system it is impossible to have
asymptotically stable equilibrium positions and asymptotically stable limit
cycles in the phase space.


Liouville’s theorem has particularly important applications in statistical 
mechanics. 
Liouville’s theorem allows one to apply methods of ergodic theory³² to
the study of mechanics. We consider only the simplest example:


D Poincaré’s recurrence theorem 


Let g be a volume-preserving continuous one-to-one mapping which maps 
a bounded region D of euclidean space onto itself: gD = D. 

Then in any neighborhood $U$ of any point of $D$ there is a point $x \in U$
which returns to $U$, i.e., $g^n x \in U$ for some $n > 0$.




Figure 49 The way a ball will move in an asymmetrical cup is unknown; however 
Poincaré’s theorem predicts that it will return to a neighborhood of the original position. 


This theorem applies, for example, to the phase flow $g^t$ of a two-dimensional
system whose potential $U(x_1, x_2)$ goes to infinity as $(x_1, x_2) \to \infty$; in
this case the invariant bounded region in phase space is given by the condition
(Figure 49)

$$D = \{p, q: T + U \le E\}.$$


Poincaré’s theorem can be strengthened, showing that almost every
moving point returns repeatedly to the vicinity of its initial position. This is
one of the few general conclusions which can be drawn about the character
of motion. The details of motion are not known at all, even in the case

$$\ddot{\mathbf{x}} = -\frac{\partial U}{\partial \mathbf{x}}, \quad \text{where } \mathbf{x} = (x_1, x_2).$$


The following prediction is a paradoxical conclusion from the theorems 
of Poincaré and Liouville: if you open a partition separating a chamber 
containing gas and a chamber with a vacuum, then after a while the gas 
molecules will again collect in the first chamber (Figure 50). 

The resolution of the paradox lies in the fact that “a while” may be longer 
than the duration of the solar system’s existence. 


32 Cf., for example, the book: Halmos, Lectures on Ergodic Theory, 1956 (Mathematical Society
of Japan. Publications. No. 3).

Figure 50 Molecules return to the first chamber.


Figure 51 Theorem on returning 


PROOF OF POINCARÉ’S THEOREM. We consider the images of the neighborhood
$U$ (Figure 51):

$$U, gU, g^2 U, \ldots, g^n U, \ldots$$

All of these have the same volume. If they never intersected, $D$ would have
infinite volume. Therefore, for some $k \ge 0$ and $l \ge 0$, with $k > l$,

$$g^k U \cap g^l U \ne \emptyset.$$

Therefore, $g^{k-l} U \cap U \ne \emptyset$. If $y$ is in this intersection, then $y = g^n x$, with
$x \in U$ ($n = k - l$). Then $x \in U$ and $g^n x \in U$ ($n = k - l$). □


E Applications of Poincaré’s theorem 


EXAMPLE 1. Let $D$ be a circle and $g$ rotation through an angle $\alpha$. If $\alpha =
2\pi(m/n)$, then $g^n$ is the identity, and the theorem is obvious. If $\alpha$ is not
commensurable with $2\pi$, then Poincaré’s theorem gives

$$\forall \delta > 0, \ \exists n: |g^n x - x| < \delta \quad \text{(Figure 52)}.$$

Figure 52 Dense set on the circle
It easily follows that


Theorem. If $\alpha \ne 2\pi(m/n)$, then the set of points $g^k x$ is dense³³ on the circle
($k = 1, 2, \ldots$).
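
The return property for a rotation of the circle is easy to observe by direct search. The sketch below assumes a rotation angle of 1 radian (not commensurable with $2\pi$) and measures distance along the circle by arc length; the names `circle_distance` and `first_return` and the tolerance are illustrative assumptions.

```python
import math

def circle_distance(angle):
    """Arc-length distance on the unit circle between the angle 0 and the given angle."""
    a = angle % (2.0 * math.pi)
    return min(a, 2.0 * math.pi - a)

def first_return(alpha, delta, max_n=10**6):
    """Smallest n >= 1 such that rotating n times by alpha returns within delta, else None."""
    for n in range(1, max_n + 1):
        if circle_distance(n * alpha) < delta:
            return n
    return None

alpha = 1.0    # assumed rotation angle in radians
delta = 0.01
n_ret = first_return(alpha, delta)
print(n_ret, circle_distance(n_ret * alpha))
```

Since a rotation moves every point of the circle by the same amount, the returning iterate found here does not depend on the starting point.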


PROBLEM. Show that every orbit of motion in a central field with $U = r^4$ is
either closed or densely fills the ring between two circles.


EXAMPLE 2. Let $D$ be the two-dimensional torus and $\varphi_1$ and $\varphi_2$ angular
coordinates on it (longitude and latitude) (Figure 53).

Figure 53 Torus

Consider the system of ordinary differential equations on the torus

$$\dot{\varphi}_1 = \alpha_1, \qquad \dot{\varphi}_2 = \alpha_2.$$

Clearly, $\operatorname{div} \mathbf{f} = 0$ and the corresponding motion

$$g^t: (\varphi_1, \varphi_2) \to (\varphi_1 + \alpha_1 t, \varphi_2 + \alpha_2 t)$$

preserves the volume $d\varphi_1\, d\varphi_2$. From Poincaré’s theorem it is easy to deduce


Theorem. If $\alpha_1/\alpha_2$ is irrational, then the “winding line” on the torus, $g^t(\varphi_1, \varphi_2)$,
is dense in the torus.


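PROBLEM. Show that if $\omega$ is irrational, then the Lissajous figure ($x = \cos t$,
$y = \cos \omega t$) is dense in the square $|x| \le 1$, $|y| \le 1$.


EXAMPLE 3. Let $D$ be the n-dimensional torus $T^n$, i.e., the direct product³⁴
of n circles:

$$D = \underbrace{S^1 \times S^1 \times \cdots \times S^1}_{n} = T^n.$$

A point on the n-dimensional torus is given by n angular coordinates
$\boldsymbol{\varphi} = (\varphi_1, \ldots, \varphi_n)$. Let $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)$, and let $g^t$ be the volume-preserving
transformation

$$g^t: T^n \to T^n, \qquad \boldsymbol{\varphi} \to \boldsymbol{\varphi} + \boldsymbol{\alpha} t.$$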


33 A set A is dense in B if there is a point of A in every neighborhood of every point of B. 


34 The direct product of the sets A, B, ... is the set of points (a, b, ...), with $a \in A$, $b \in B$, ....

PROBLEM. Under which conditions on $\boldsymbol{\alpha}$ are the following sets dense: (a) the
trajectory $\{g^t \boldsymbol{\varphi}\}$; (b) the trajectory $\{g^k \boldsymbol{\varphi}\}$ ($t$ belongs to the group of real
numbers $\mathbb{R}$, $k$ to the group of integers $\mathbb{Z}$).


The transformations in Examples 1 to 3 are closely connected to
mechanics. But since Poincaré’s theorem is abstract, it also has applications
unconnected with mechanics.


EXAMPLE 4. Consider the first digits of the numbers 2": 1, 2, 4, 8, 1, 3, 6, 1, 2, 
5,1,2,4,.... 


PROBLEM. Does the digit 7 appear in this sequence? Which digit appears 
more often, 7 or 8? How many times more often? 
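
Before attempting the problem analytically, one can experiment with the sequence of first digits. The sketch below uses exact integer arithmetic; the function name `first_digits` and the sample size of 5000 powers are arbitrary choices for this experiment. Counting digits in a finite sample only suggests an answer and is not a proof.

```python
def first_digits(count):
    """Leading decimal digits of 2**0, 2**1, ..., 2**(count - 1), using exact integers."""
    digits = []
    power = 1
    for _ in range(count):
        digits.append(int(str(power)[0]))
        power *= 2
    return digits

head = first_digits(13)          # the terms listed in Example 4
sample = first_digits(5000)
count7 = sample.count(7)
count8 = sample.count(8)
print(head)
print(count7, count8)
```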
Lagrangian mechanics on manifolds


In this chapter we introduce the concepts of a differentiable manifold and 
its tangent bundle. A lagrangian function, given on the tangent bundle, 
defines a lagrangian “holonomic system” on a manifold. Systems of point 
masses with holonomic constraints (e.g., a pendulum or a rigid body) are 
special cases. 


17 Holonomic constraints 


In this paragraph we define the notion of a system of point masses with holonomic constraints. 


A Example 


Let $\gamma$ be a smooth curve in the plane. If there is a very strong force field in a
neighborhood of $\gamma$, directed towards the curve, then a moving point will
always be close to $\gamma$. In the limit case of an infinite force field, the point must
remain on the curve $\gamma$. In this case we say that a constraint is put on the
system (Figure 54).

To formulate this precisely, we introduce curvilinear coordinates $q_1$ and
$q_2$ on a neighborhood of $\gamma$; $q_1$ is in the direction of $\gamma$ and $q_2$ is distance from
the curve.

We consider the system with potential energy

$$U_N = N q_2^2 + U_0(q_1, q_2),$$

depending on the parameter $N$ (which we will let tend to infinity) (Figure 55).
We consider the initial conditions on $\gamma$:

$$q_1(0) = q_1^0, \qquad \dot{q}_1(0) = \dot{q}_1^0, \qquad q_2(0) = 0, \qquad \dot{q}_2(0) = 0.$$
Figure 54 Constraint as an infinitely strong field


Figure 55 Potential energy $U_N$


Denote by $q_1 = \varphi(t, N)$ the evolution of the coordinate $q_1$ under a motion
with these initial conditions in the field $U_N$.


Theorem. The following limit exists, as $N \to \infty$:

$$\lim_{N \to \infty} \varphi(t, N) = \psi(t).$$

The limit $q_1 = \psi(t)$ satisfies Lagrange’s equation

$$\frac{d}{dt} \left( \frac{\partial L_*}{\partial \dot{q}_1} \right) = \frac{\partial L_*}{\partial q_1},$$

where $L_*(q_1, \dot{q}_1) = T|_{q_2 = \dot{q}_2 = 0} - U_0|_{q_2 = 0}$ ($T$ is the kinetic energy of
motion along $\gamma$).


Thus, as $N \to \infty$, Lagrange’s equations for $q_1$ and $q_2$ induce Lagrange’s
equation for $q_1 = \psi(t)$.

We obtain exactly the same result if we replace the plane by the 3n-dimensional
configuration space of n points, consisting of a mechanical
system with metric $ds^2 = \sum_{i=1}^{n} m_i\, d\mathbf{r}_i^2$ (the $m_i$ are masses), replace the curve $\gamma$
by a submanifold of the 3n-dimensional space, replace $q_1$ by some coordinates
$\mathbf{q}_1$ on $\gamma$, and replace $q_2$ by some coordinates $\mathbf{q}_2$ in the directions perpendicular
to $\gamma$. If the potential energy has the form

$$U = U_0(\mathbf{q}_1, \mathbf{q}_2) + N \mathbf{q}_2^2,$$

then as $N \to \infty$, a motion on $\gamma$ is defined by Lagrange’s equations with the
lagrangian function

$$L_* = T|_{\mathbf{q}_2 = \dot{\mathbf{q}}_2 = 0} - U_0|_{\mathbf{q}_2 = 0}.$$
B Definition of a system with constraints


We will not prove the theorem above,³⁵ but neither will we use it. We need
it only to justify the following.


Definition. Let $\gamma$ be an m-dimensional surface in the 3n-dimensional configuration
space of the points $\mathbf{r}_1, \ldots, \mathbf{r}_n$ with masses $m_1, \ldots, m_n$. Let
$\mathbf{q} = (q_1, \ldots, q_m)$ be some coordinates on $\gamma$: $\mathbf{r}_i = \mathbf{r}_i(\mathbf{q})$. The system
described by the equations

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{\mathbf{q}}} = \frac{\partial L}{\partial \mathbf{q}}, \qquad L = \tfrac{1}{2} \sum m_i \dot{\mathbf{r}}_i^2 + U(\mathbf{q})$$

is called a system of n points with $3n - m$ ideal holonomic constraints.
The surface $\gamma$ is called the configuration space of the system with constraints.

If the surface $\gamma$ is given by $k = 3n - m$ functionally independent
equations $f_1(\mathbf{r}) = 0, \ldots, f_k(\mathbf{r}) = 0$, then we say that the system is constrained
by the relations $f_1 = 0, \ldots, f_k = 0$.


Holonomic constraints also could have been defined as the limiting case 
of a system with a large potential energy. The meaning of these constraints in 
mechanics lies in the experimentally determined fact that many mechanical 
systems belong to this class more or less exactly. 

From now on, for convenience, we will call ideal holonomic constraints 
simply constraints. Other constraints will not be considered in this book. 


18 Differentiable manifolds 


The configuration space of a system with constraints is a differentiable manifold. In this para- 
graph we give the elementary facts about differentiable manifolds. 


A Definition of a differentiable manifold 


A set M is given the structure of a differentiable manifold if M is provided 
with a finite or countable collection of charts, so that every point is represented 
in at least one chart. 

A chart is an open set $U$ in the euclidean coordinate space $\mathbf{q} = (q_1, \ldots, q_n)$,
together with a one-to-one mapping $\varphi$ of $U$ onto some subset of $M$,
$\varphi: U \to \varphi U \subset M$.

We assume that if points $p$ and $p'$ in two charts $U$ and $U'$ have the same
image in $M$, then $p$ and $p'$ have neighborhoods $V \subset U$ and $V' \subset U'$ with the
same image in $M$ (Figure 56). In this way we get a mapping $\varphi'^{-1}\varphi: V \to V'$.

This is a mapping of the region $V$ of the euclidean space $\mathbf{q}$ onto the region
$V'$ of the euclidean space $\mathbf{q}'$, and it is given by n functions of n variables,
$\mathbf{q}' = \mathbf{q}'(\mathbf{q})$, ($\mathbf{q} = \mathbf{q}(\mathbf{q}')$). The charts $U$ and $U'$ are called compatible if these
functions are differentiable.³⁶


³⁵ The proof is based on the fact that, due to the conservation of energy, a moving point cannot
move further from $\gamma$ than $cN^{-1/2}$, which approaches zero as $N \to \infty$.


Figure 56 Compatible charts

An atlas is a union of compatible charts. Two atlases are equivalent if 
their union is also an atlas. 

A differentiable manifold is a class of equivalent atlases. We will consider
only connected manifolds.³⁷ Then the number n will be the same for all
charts; it is called the dimension of the manifold.

A neighborhood of a point on a manifold is the image under a mapping
$\varphi: U \to M$ of a neighborhood of the representation of this point in a chart $U$.
We will assume that every two different points have non-intersecting
neighborhoods.


B Examples 


EXAMPLE 1. Euclidean space $\mathbb{R}^n$ is a manifold, with an atlas consisting of one chart.


EXAMPLE 2. The sphere $S^2 = \{(x, y, z): x^2 + y^2 + z^2 = 1\}$ has the structure of a manifold, with
atlas, for example, consisting of two charts $(U_i, \varphi_i)$, $i = 1, 2$, in stereographic projection (Figure
57). An analogous construction applies to the n-sphere

$$S^n = \{(x_1, \ldots, x_{n+1}): \textstyle\sum x_i^2 = 1\}.$$


Figure 57 Atlas of a sphere
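
The two stereographic charts of Example 2 can be written out explicitly. The following is a minimal sketch, assuming the unit sphere and projection from the north and south poles onto the equatorial plane; the function names `chart_north`, `chart_south`, and `transition` are illustrative and do not come from the text.

```python
def chart_north(u, v):
    """Inverse stereographic projection from the north pole: R^2 -> S^2 minus (0, 0, 1)."""
    d = 1.0 + u * u + v * v
    return (2 * u / d, 2 * v / d, (u * u + v * v - 1) / d)


def chart_south(u, v):
    """Inverse stereographic projection from the south pole: R^2 -> S^2 minus (0, 0, -1)."""
    d = 1.0 + u * u + v * v
    return (2 * u / d, 2 * v / d, (1 - u * u - v * v) / d)


def transition(u, v):
    """South-chart coordinates of the point whose north-chart coordinates are (u, v).

    Defined for (u, v) != (0, 0): the origin of the north chart is the south pole,
    which the south chart does not cover.
    """
    s = u * u + v * v
    return (u / s, v / s)
```

On the overlap of the two charts the transition map $(u, v) \mapsto (u, v)/(u^2 + v^2)$ is differentiable, which is exactly the compatibility condition of subsection A.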


EXAMPLE 3. Consider a planar pendulum. Its configuration space, the circle $S^1$, is a manifold.
The usual atlas is furnished by the angular coordinates $\varphi: \mathbb{R}^1 \to S^1$, $U_1 = (-\pi, \pi)$, $U_2 = (0, 2\pi)$
(Figure 58).


EXAMPLE 4. The configuration space of the “spherical” mathematical pendulum is the two-dimensional
sphere $S^2$ (Figure 58).


³⁶ By differentiable here we mean r times continuously differentiable; the exact value of r
($1 \le r \le \infty$) is immaterial (we may take $r = \infty$, for example).


³⁷ A manifold is connected if it cannot be divided into two disjoint open subsets.


Figure 58 Planar, spherical and double planar pendulums 


EXAMPLE 5. The configuration space of a “planar double pendulum” is the direct product of two
circles, i.e., the two-torus $T^2 = S^1 \times S^1$ (Figure 58).


EXAMPLE 6. The configuration space of a spherical double pendulum is the direct product of
two spheres, $S^2 \times S^2$.


EXAMPLE 7. A rigid line segment in the $(q_1, q_2)$-plane has for its configuration space the manifold
$\mathbb{R}^2 \times S^1$, with coordinates $q_1, q_2, q_3$ (Figure 59). It is covered by two charts.


Figure 59 Configuration space of a segment in the plane


EXAMPLE 8. A rigid right triangle OAB moves around the vertex O. The position of the triangle
is given by three numbers: the direction $OA \in S^2$ is given by two numbers, and if OA is given,
one can rotate $OB \in S^1$ around the axis OA (Figure 60).

Connected with the position of the triangle OAB is an orthogonal right-handed frame,
$\mathbf{e}_1 = OA/|OA|$, $\mathbf{e}_2 = OB/|OB|$, $\mathbf{e}_3 = [\mathbf{e}_1, \mathbf{e}_2]$. The correspondence is one-to-one; therefore the
position of the triangle is given by an orthogonal three-by-three matrix with determinant 1.


Figure 60 Configuration space of a triangle

The set of all three-by-three matrices is the nine-dimensional space $\mathbb{R}^9$. Six orthogonality
conditions select out two three-dimensional connected manifolds of matrices with determinant
+1 and −1. The rotations of three-space (determinant +1) form a group, which we call SO(3).
Therefore, the configuration space of the triangle OAB is SO(3).
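
The correspondence between positions of the triangle and matrices in SO(3) can be checked numerically. This is a small sketch under the assumption that the right angle of the triangle is at O, so that OA and OB are perpendicular; the helper names (`frame`, `det3`, `gram`) are hypothetical.

```python
import math


def cross(a, b):
    """Vector product [a, b] in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    n = math.sqrt(sum(c * c for c in a))
    return tuple(c / n for c in a)


def frame(oa, ob):
    """Rows e1, e2, e3 of the frame attached to the triangle (assumes OA is perpendicular to OB)."""
    e1, e2 = normalize(oa), normalize(ob)
    return (e1, e2, cross(e1, e2))


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def gram(m):
    """Matrix of scalar products of the rows; the identity for an orthogonal matrix."""
    return [[sum(x * y for x, y in zip(r, s)) for s in m] for r in m]
```

For example, `frame((1, 2, 2), (2, 1, -2))` yields a matrix whose Gram matrix is the identity (the six orthogonality conditions) and whose determinant is +1.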


PROBLEM. Show that SO(3) is homeomorphic to three-dimensional real projective space.


Definition. The dimension of the configuration space is called the number of
degrees of freedom.


EXAMPLE 9. Consider a system of k rods in a closed chain with hinged joints.

PROBLEM. How many degrees of freedom does this system have?


EXAMPLE 10. Embedded manifolds. We say that $M$ is an embedded k-dimensional submanifold of
euclidean space $\mathbb{R}^n$ (Figure 61) if in a neighborhood $U$ of every point $\mathbf{x} \in M$ there are $n - k$ functions
$f_1: U \to \mathbb{R}$, $f_2: U \to \mathbb{R}$, ..., $f_{n-k}: U \to \mathbb{R}$ such that the intersection of $U$ with $M$ is given by
the equations $f_1 = 0, \ldots, f_{n-k} = 0$, and the vectors $\operatorname{grad} f_1, \ldots, \operatorname{grad} f_{n-k}$ at $\mathbf{x}$ are linearly
independent.


Figure 61 Embedded submanifold


It is easy to give $M$ the structure of a manifold, i.e., coordinates in a neighborhood of $\mathbf{x}$ (how?).
It can be shown that every manifold can be embedded in some euclidean space. In Example 8,
SO(3) is a subset of $\mathbb{R}^9$.


PROBLEM. Show that SO(3) is embedded in $\mathbb{R}^9$, and at the same time, that SO(3) is a manifold.


C Tangent space 


If $M$ is a k-dimensional manifold embedded in $E^n$, then at every point $\mathbf{x}$
we have a k-dimensional tangent space $TM_{\mathbf{x}}$. Namely, $TM_{\mathbf{x}}$ is the orthogonal
complement to $\{\operatorname{grad} f_1, \ldots, \operatorname{grad} f_{n-k}\}$ (Figure 62). The vectors of the
tangent space $TM_{\mathbf{x}}$ based at $\mathbf{x}$ are called tangent vectors to $M$ at $\mathbf{x}$. We can
also define these vectors directly as velocity vectors of curves in $M$:

$$\dot{\mathbf{x}} = \lim_{t \to 0} \frac{\varphi(t) - \varphi(0)}{t},$$

where $\varphi(0) = \mathbf{x}$, $\varphi(t) \in M$.
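
The agreement of the two descriptions of a tangent vector can be illustrated on the unit sphere. The sketch below assumes $M = S^2$ given by the single equation $f = x_1^2 + x_2^2 + x_3^2 - 1 = 0$ and an arbitrarily chosen curve (a circle of latitude); it approximates the velocity vector by the difference quotient above and checks that it is nearly orthogonal to grad f.

```python
import math


def curve(t, lat=0.5):
    """A circle of latitude on the unit sphere; curve(0.0) is the base point x."""
    return (math.cos(lat) * math.cos(t), math.cos(lat) * math.sin(t), math.sin(lat))


def velocity(t_small=1e-6):
    """Difference quotient (phi(t) - phi(0)) / t approximating the velocity vector at t = 0."""
    p0, p1 = curve(0.0), curve(t_small)
    return tuple((b - a) / t_small for a, b in zip(p0, p1))


def grad_f(x):
    """Gradient of f(x) = x1^2 + x2^2 + x3^2 - 1, whose zero set is the sphere."""
    return tuple(2.0 * c for c in x)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))
```

The scalar product `dot(velocity(), grad_f(curve(0.0)))` is of the order of the step `t_small` and tends to zero with it, as the definition of $TM_{\mathbf{x}}$ as an orthogonal complement requires.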


Figure 62 Tangent space


The definition of tangent vectors can also be given in intrinsic terms,
independent of the embedding of $M$ into $E^n$.

We will call two curves $\mathbf{x} = \varphi(t)$ and $\mathbf{x} = \psi(t)$ equivalent if $\varphi(0) = \psi(0) = \mathbf{x}$
and $\lim_{t \to 0} (\varphi(t) - \psi(t))/t = 0$ in some chart. Then this tangent relationship
is true in any chart (prove this!).


Definition. A tangent vector to a manifold $M$ at the point $\mathbf{x}$ is an equivalence
class of curves $\varphi(t)$, with $\varphi(0) = \mathbf{x}$.

It is easy to define the operations of multiplication of a tangent vector
by a number and addition of tangent vectors. The set of tangent vectors
to $M$ at $\mathbf{x}$ forms a vector space $TM_{\mathbf{x}}$. This space is also called the tangent
space to $M$ at $\mathbf{x}$.


For embedded manifolds the definition above agrees with the previous 
definition. Its advantage lies in the fact that it also holds for abstract 
manifolds, not embedded anywhere. 


Definition. Let $U$ be a chart of an atlas for $M$ with coordinates $q_1, \ldots, q_n$.
Then the components of the tangent vector to the curve $\mathbf{q} = \varphi(t)$ are the
numbers $\xi_1, \ldots, \xi_n$, where $\xi_i = (d\varphi_i/dt)|_{t=0}$.


D The tangent bundle 


The union of the tangent spaces to $M$ at the various points, $\bigcup_{\mathbf{x} \in M} TM_{\mathbf{x}}$, has
a natural differentiable manifold structure, the dimension of which is twice
the dimension of $M$.

This manifold is called the tangent bundle of $M$ and is denoted by $TM$. A
point of $TM$ is a vector $\xi$, tangent to $M$ at some point $\mathbf{x}$. Local coordinates
on $TM$ are constructed as follows. Let $q_1, \ldots, q_n$ be local coordinates on
$M$, and $\xi_1, \ldots, \xi_n$ components of a tangent vector in this coordinate system.
Then the 2n numbers $(q_1, \ldots, q_n, \xi_1, \ldots, \xi_n)$ give a local coordinate system
on $TM$. One sometimes writes $dq_i$ for $\xi_i$.

The mapping $p: TM \to M$ which takes a tangent vector $\xi$ to the point
$\mathbf{x} \in M$ at which the vector is tangent to $M$ ($\xi \in TM_{\mathbf{x}}$), is called the natural
projection. The inverse image of a point $\mathbf{x} \in M$ under the natural projection,
$p^{-1}(\mathbf{x})$, is the tangent space $TM_{\mathbf{x}}$. This space is called the fiber of the tangent
bundle over the point $\mathbf{x}$.


E Riemannian manifolds 


If $M$ is a manifold embedded in euclidean space, then the metric on euclidean
space allows us to measure the lengths of curves, angles between vectors,
volumes, etc. All of these quantities are expressed by means of the lengths of
tangent vectors, that is, by the positive-definite quadratic form given on
every tangent space $TM_{\mathbf{x}}$ (Figure 63):

$$TM_{\mathbf{x}} \to \mathbb{R}, \qquad \xi \mapsto \langle \xi, \xi \rangle.$$


Figure 63 Riemannian metric


For example, the length of a curve on a manifold is expressed using this form as
$l(\gamma) = \int_\gamma \sqrt{\langle d\mathbf{x}, d\mathbf{x} \rangle}$, or, if the curve is given parametrically, $\gamma: [t_0, t_1] \to M$, $t \mapsto \mathbf{x}(t) \in M$, then

$$l(\gamma) = \int_{t_0}^{t_1} \sqrt{\langle \dot{\mathbf{x}}, \dot{\mathbf{x}} \rangle}\, dt.$$


Definition. A differentiable manifold with a fixed positive-definite quadratic
form $\langle \xi, \xi \rangle$ on every tangent space $TM_{\mathbf{x}}$ is called a Riemannian manifold.
The quadratic form is called the Riemannian metric.


Remark. Let $U$ be a chart of an atlas for $M$ with coordinates $q_1, \ldots, q_n$.
Then a Riemannian metric is given by the formula

$$ds^2 = \sum_{i,j=1}^{n} a_{ij}(\mathbf{q})\, dq_i\, dq_j, \qquad a_{ij} = a_{ji},$$

where $dq_i$ are the coordinates of a tangent vector.
The functions $a_{ij}(\mathbf{q})$ are assumed to be differentiable as many times as
necessary.
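
The length formula can be evaluated numerically once the coefficients $a_{ij}(\mathbf{q})$ are known in a chart. The following sketch assumes the unit sphere with coordinates (colatitude, longitude), for which $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$; the midpoint rule and the function names are illustrative choices, not part of the text.

```python
import math


def sphere_metric(q):
    """Coefficients a_ij of ds^2 = dtheta^2 + sin^2(theta) dphi^2 at q = (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(q[0]) ** 2]]


def curve_length(metric, curve, velocity, t0, t1, steps=1000):
    """Midpoint-rule approximation of the integral of sqrt(a_ij dq_i/dt dq_j/dt) over [t0, t1]."""
    h = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * h
        q, dq = curve(t), velocity(t)
        a = metric(q)
        speed2 = sum(a[i][j] * dq[i] * dq[j]
                     for i in range(len(q)) for j in range(len(q)))
        total += math.sqrt(speed2) * h
    return total


def latitude_circle_length(theta0):
    """Length of the parallel theta = theta0, traversed once in longitude."""
    return curve_length(sphere_metric,
                        lambda t: (theta0, t),
                        lambda t: (0.0, 1.0),
                        0.0, 2 * math.pi)
```

For a parallel at colatitude $\theta_0$ the result agrees with the elementary value $2\pi \sin\theta_0$.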


F The derivative map 


Let $f: M \to N$ be a mapping of a manifold $M$ to a manifold $N$. $f$ is called
differentiable if in local coordinates on $M$ and $N$ it is given by differentiable
functions.


Definition. The derivative of a differentiable mapping $f: M \to N$ at a point
$\mathbf{x} \in M$ is the linear map of the tangent spaces

$$f_{*\mathbf{x}}: TM_{\mathbf{x}} \to TN_{f(\mathbf{x})},$$

which is given in the following way (Figure 64):

Let $\mathbf{v} \in TM_{\mathbf{x}}$. Consider a curve $\varphi: \mathbb{R} \to M$ with $\varphi(0) = \mathbf{x}$, and velocity
vector $(d\varphi/dt)|_{t=0} = \mathbf{v}$. Then $f_{*\mathbf{x}}\mathbf{v}$ is the velocity vector of the curve
$f \circ \varphi: \mathbb{R} \to N$,

$$f_{*\mathbf{x}}\mathbf{v} = \frac{d}{dt}\bigg|_{t=0} f(\varphi(t)).$$


Figure 64 Derivative of a mapping

PROBLEM. Show that the vector $f_{*\mathbf{x}}\mathbf{v}$ does not depend on the curve $\varphi$, but only on the vector $\mathbf{v}$.


PROBLEM. Show that the map $f_{*\mathbf{x}}: TM_{\mathbf{x}} \to TN_{f(\mathbf{x})}$ is linear.


PROBLEM. Let $\mathbf{x} = (x_1, \ldots, x_m)$ be coordinates in a neighborhood of $\mathbf{x} \in M$, and $\mathbf{y} = (y_1, \ldots, y_n)$
be coordinates in a neighborhood of $\mathbf{y} \in N$. Let $\xi$ be the set of components of the vector $\mathbf{v}$, and
$\eta$ the set of components of the vector $f_{*\mathbf{x}}\mathbf{v}$. Show that

$$\eta = \frac{\partial \mathbf{y}}{\partial \mathbf{x}}\, \xi, \qquad \text{i.e.,} \qquad \eta_i = \sum_j \frac{\partial y_i}{\partial x_j}\, \xi_j.$$

Taking the union of the mappings $f_{*\mathbf{x}}$ for all $\mathbf{x}$, we get a mapping of the whole tangent
bundle

$$f_*: TM \to TN, \qquad f_*\mathbf{v} = f_{*\mathbf{x}}\mathbf{v} \quad \text{for } \mathbf{v} \in TM_{\mathbf{x}}.$$

PROBLEM. Show that $f_*$ is a differentiable map.


PROBLEM. Let $f: M \to N$, $g: N \to K$, and $h = g \circ f: M \to K$. Show that $h_* = g_* \circ f_*$.
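
The coordinate formula for $f_*$ and the composition rule of the last problem can be tested numerically before they are proved. This is a rough sketch in which the manifolds are coordinate spaces, the Jacobian is approximated by central differences, and the maps `f` and `g` are arbitrary hypothetical examples.

```python
import math


def jacobian(F, x, eps=1e-6):
    """Matrix of partial derivatives dF_i/dx_j, approximated by central differences."""
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        cols.append([(a - b) / (2 * eps) for a, b in zip(F(xp), F(xm))])
    return [[col[i] for col in cols] for i in range(len(cols[0]))]


def pushforward(F, x, v):
    """Components eta_i = sum_j (dF_i/dx_j) xi_j of the image vector F_* v."""
    return [sum(a * b for a, b in zip(row, v)) for row in jacobian(F, x)]


def f(x):  # hypothetical map R^2 -> R^3
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]


def g(y):  # hypothetical map R^3 -> R^2
    return [y[0] + y[1] * y[2], y[0] * y[2]]


def h(x):  # the composition g o f
    return g(f(x))
```

Up to the finite-difference error, `pushforward(h, x, v)` coincides with `pushforward(g, f(x), pushforward(f, x, v))`, i.e., $h_* = g_* \circ f_*$.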


19 Lagrangian dynamical systems 


In this paragraph we define lagrangian dynamical systems on manifolds. Systems with holonomic 
constraints are a particular case. 


A Definition of a lagrangian system 

Let $M$ be a differentiable manifold, $TM$ its tangent bundle, and $L: TM \to \mathbb{R}$
a differentiable function. A map $\gamma: \mathbb{R} \to M$ is called a motion in the lagrangian
system with configuration manifold $M$ and lagrangian function $L$ if $\gamma$ is an
extremal of the functional

$$\Phi(\gamma) = \int_{t_0}^{t_1} L(\dot\gamma)\, dt,$$

where $\dot\gamma$ is the velocity vector $\dot\gamma(t) \in TM_{\gamma(t)}$.


EXAMPLE. Let $M$ be a region in a coordinate space with coordinates $\mathbf{q} = (q_1, \ldots, q_n)$. The
lagrangian function $L: TM \to \mathbb{R}$ may be written in the form of a function $L(\mathbf{q}, \dot{\mathbf{q}})$ of the 2n
coordinates. As we showed in Section 12, the evolution of coordinates of a point moving with
time satisfies Lagrange’s equations.


Theorem. The evolution of the local coordinates $\mathbf{q} = (q_1, \ldots, q_n)$ of a point $\gamma(t)$
under motion in a lagrangian system on a manifold satisfies the Lagrange
equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}} = \frac{\partial L}{\partial \mathbf{q}},$$

where $L(\mathbf{q}, \dot{\mathbf{q}})$ is the expression for the function $L: TM \to \mathbb{R}$ in the coordinates
$\mathbf{q}$ and $\dot{\mathbf{q}}$ on $TM$.


We often encounter the following special case. 


B Natural systems 
Let $M$ be a Riemannian manifold. The quadratic form on each tangent space,

$$T = \tfrac{1}{2}\langle \mathbf{v}, \mathbf{v} \rangle, \qquad \mathbf{v} \in TM_{\mathbf{x}},$$

is called the kinetic energy. A differentiable function $U: M \to \mathbb{R}$ is called a
potential energy.


Definition. A lagrangian system on a Riemannian manifold is called natural 
if the lagrangian function is equal to the difference between kinetic and 
potential energies: L = T — U. 


EXAMPLE. Consider two mass points $m_1$ and $m_2$ joined by a line segment of length $l$ in the
$(x, y)$-plane. Then a configuration space of three dimensions

$$M = \mathbb{R}^2 \times S^1 \subset \mathbb{R}^2 \times \mathbb{R}^2$$

is defined in the four-dimensional configuration space $\mathbb{R}^2 \times \mathbb{R}^2$ of two free points $(x_1, y_1)$ and
$(x_2, y_2)$ by the condition $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} = l$ (Figure 65).


Figure 65 Segment in the plane


There is a quadratic form on the tangent space to the four-dimensional space $(x_1, x_2, y_1, y_2)$:

$$m_1(\dot x_1^2 + \dot y_1^2) + m_2(\dot x_2^2 + \dot y_2^2).$$

Our three-dimensional manifold, as it is embedded in the four-dimensional one, is provided with
a Riemannian metric. The holonomic system thus obtained is called in mechanics a line segment
of fixed length in the $(x, y)$-plane. The kinetic energy is given by the formula

$$T = m_1\frac{\dot x_1^2 + \dot y_1^2}{2} + m_2\frac{\dot x_2^2 + \dot y_2^2}{2}.$$


C Systems with holonomic constraints


In Section 17 we defined the notion of a system of point masses with holo- 
nomic constraints. We will now show that such a system is natural. 

Consider the configuration manifold $M$ of a system with constraints as
embedded in the 3n-dimensional configuration space of a system of free
points. The metric on the 3n-dimensional space is given by the quadratic
form $\sum_{i=1}^{n} m_i \dot{\mathbf{r}}_i^2$. The embedded Riemannian manifold $M$ with potential
energy $U$ coincides with the system defined in Section 17 or with the limiting
case of the system with potential $U + N\mathbf{q}_2^2$, $N \to \infty$, which grows rapidly
outside of $M$.


D Procedure for solving problems with constraints 


1. Determine the configuration manifold and introduce coordinates
$q_1, \ldots, q_k$ (in a neighborhood of each of its points).

2. Express the kinetic energy $T = \sum \tfrac{1}{2} m_i \dot{\mathbf{r}}_i^2$ as a quadratic form in the
generalized velocities

$$T = \tfrac{1}{2}\sum a_{ij}(\mathbf{q})\, \dot q_i \dot q_j.$$

3. Construct the lagrangian function $L = T - U(\mathbf{q})$ and solve Lagrange’s
equations.
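
As a minimal illustration of the three steps, consider the planar pendulum. The data here are assumptions made for the illustration and are not taken from the text: a point of mass $m$ on a rod of length $l$ in a uniform field $g$, with the angle $q$ measured from the downward vertical.

```latex
\begin{align*}
&\text{1. } M = S^1, \qquad \mathbf{r}(q) = (l\sin q,\; -l\cos q); \\
&\text{2. } T = \tfrac{1}{2} m \dot{\mathbf{r}}^2 = \tfrac{1}{2} m l^2 \dot q^2
   \qquad (a_{11} = m l^2); \\
&\text{3. } U = -m g l \cos q, \qquad
   L = T - U = \tfrac{1}{2} m l^2 \dot q^2 + m g l \cos q, \\
&\phantom{\text{3. }} \frac{d}{dt}\frac{\partial L}{\partial \dot q} = \frac{\partial L}{\partial q}
   \;\Longrightarrow\; m l^2 \ddot q = -m g l \sin q .
\end{align*}
```

The resulting equation $\ddot q = -(g/l)\sin q$ is the usual pendulum equation, obtained here without ever introducing the tension of the rod.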


EXAMPLE. We consider the motion of a point mass of mass 1 on a surface of revolution in three-dimensional
space. It can be shown that the orbits are geodesics on the surface. In cylindrical
coordinates $r, \varphi, z$ the surface is given (locally) in the form $r = r(z)$ or $z = z(r)$. The kinetic
energy has the form (Figure 66)

$$T = \tfrac{1}{2}(\dot x^2 + \dot y^2 + \dot z^2) = \tfrac{1}{2}\left[(1 + r_z'^2)\dot z^2 + r^2(z)\dot\varphi^2\right]$$

in coordinates $\varphi$ and $z$, and

$$T = \tfrac{1}{2}(\dot x^2 + \dot y^2 + \dot z^2) = \tfrac{1}{2}\left[(1 + z_r'^2)\dot r^2 + r^2\dot\varphi^2\right]$$

in coordinates $r$ and $\varphi$. (We have used the identity $\dot x^2 + \dot y^2 = \dot r^2 + r^2\dot\varphi^2$.)


The lagrangian function $L$ is equal to $T$. In both coordinate systems $\varphi$ is a cyclic coordinate.
The corresponding momentum is preserved; $p_\varphi = r^2\dot\varphi$ is nothing other than the z-component of
angular momentum. Since the system has two degrees of freedom, knowing the cyclic coordinate
$\varphi$ is sufficient for integrating the problem completely (cf. Corollary 3, Section 15).


Figure 66 Surface of revolution

We can obtain more easily a clear picture of the orbits by reasoning slightly differently.
Denote by $\alpha$ the angle of the orbit with a meridian. We have $r\dot\varphi = |\mathbf{v}| \sin\alpha$, where $|\mathbf{v}|$ is the magnitude
of the velocity vector (Figure 66).

By the law of conservation of energy, $H = L = T$ is preserved. Therefore, $|\mathbf{v}| = \text{const}$, so
the conservation law for $p_\varphi$ takes the form

$$r \sin\alpha = \text{const}$$

(“Clairaut’s theorem”).

This relationship shows that the motion takes place in the region $|\sin\alpha| \le 1$, i.e., $r \ge r_0 \sin\alpha_0$.
Furthermore, the inclination of the orbit from the meridian increases as the radius $r$ decreases.
When the radius reaches the smallest possible value, $r = r_0 \sin\alpha_0$, the orbit is reflected and
returns to the region with larger $r$ (Figure 67).
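
Clairaut’s relation can be observed numerically. The sketch below integrates Lagrange’s equations for $L = T$ in the coordinates $(z, \varphi)$ with a classical Runge-Kutta step and records $r \sin\alpha = r \cdot r\dot\varphi/|\mathbf{v}|$ along the orbit. The profile $r(z) = 2 + \cos z$, the step size, and all function names are assumptions chosen only for illustration.

```python
import math


def r(z):    # hypothetical profile of the surface of revolution
    return 2.0 + math.cos(z)


def dr(z):   # r'(z)
    return -math.sin(z)


def d2r(z):  # r''(z)
    return -math.cos(z)


def rhs(s):
    """Lagrange's equations for L = ((1 + r'^2) z'^2 + r^2 phi'^2) / 2, solved for accelerations."""
    z, phi, vz, vphi = s
    rz, r1, r2 = r(z), dr(z), d2r(z)
    az = (rz * r1 * vphi ** 2 - r1 * r2 * vz ** 2) / (1.0 + r1 ** 2)
    aphi = -2.0 * r1 * vz * vphi / rz
    return (vz, vphi, az, aphi)


def rk4_step(s, h):
    def shift(a, k, c):
        return tuple(u + c * w for u, w in zip(a, k))
    k1 = rhs(s)
    k2 = rhs(shift(s, k1, h / 2))
    k3 = rhs(shift(s, k2, h / 2))
    k4 = rhs(shift(s, k3, h))
    return tuple(u + h / 6 * (a + 2 * b + 2 * c + d)
                 for u, a, b, c, d in zip(s, k1, k2, k3, k4))


def clairaut(s):
    """The quantity r sin(alpha), computed as r * (r phi') / |v|."""
    z, phi, vz, vphi = s
    speed = math.sqrt((1.0 + dr(z) ** 2) * vz ** 2 + r(z) ** 2 * vphi ** 2)
    return r(z) * (r(z) * vphi / speed)


def run(s, h=1e-3, steps=5000):
    """Values of r sin(alpha) along the numerically integrated geodesic."""
    values = [clairaut(s)]
    for _ in range(steps):
        s = rk4_step(s, h)
        values.append(clairaut(s))
    return values
```

Along the computed orbit the recorded values stay constant to within the integration error, while $r$ and $\alpha$ individually vary.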


Figure 67 Geodesics on a surface of revolution


PROBLEM. Show that the geodesics on a convex surface of revolution are divided into three
classes: meridians, closed curves, and geodesics dense in a ring $r \ge c$.


PROBLEM. Study the behavior of geodesics on the surface of a torus ($(r - R)^2 + z^2 = \rho^2$).


E Non-autonomous systems 


A lagrangian non-autonomous system differs from the autonomous systems,
which we have been studying until now, by the additional dependence of the
lagrangian function on time:

$$L: TM \times \mathbb{R} \to \mathbb{R}, \qquad L = L(\mathbf{q}, \dot{\mathbf{q}}, t).$$

In particular, both the kinetic and potential energies can depend on time in a
non-autonomous natural system:

$$T: TM \times \mathbb{R} \to \mathbb{R}, \quad U: M \times \mathbb{R} \to \mathbb{R}, \qquad T = T(\mathbf{q}, \dot{\mathbf{q}}, t), \quad U = U(\mathbf{q}, t).$$

A system of n mass points, constrained by holonomic constraints dependent
on time, is defined with the help of a time-dependent submanifold of the
configuration space of a free system. Such a manifold is given by a mapping

$$i: M \times \mathbb{R} \to E^{3n}, \qquad i(\mathbf{q}, t) = \mathbf{x},$$

which, for any fixed $t \in \mathbb{R}$, defines an embedding $M \to E^{3n}$. The formulas of
section D remain true for non-autonomous systems.


Figure 68 Bead on a rotating circle


EXAMPLE. Consider the motion of a bead along a vertical circle of radius $r$ (Figure 68) which
rotates with angular velocity $\omega$ around the vertical axis passing through the center O of the
circle. The manifold $M$ is the circle. Let $q$ be the angular coordinate on the circle, measured from
the highest point.

Let $x$, $y$, and $z$ be cartesian coordinates in $E^3$ with origin O and vertical axis $z$. Let $\varphi$ be the
angle of the plane of the circle with the plane $xOz$. By hypothesis, $\varphi = \omega t$. The mapping
$i: M \times \mathbb{R} \to E^3$ is given by the formula

$$i(q, t) = (r \sin q \cos\omega t,\; r \sin q \sin\omega t,\; r \cos q).$$

From this formula (or, more simply, from an “infinitesimal right triangle”) we find that

$$T = \frac{m}{2}(\omega^2 r^2 \sin^2 q + r^2\dot q^2), \qquad U = mgr \cos q.$$
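
For a reader who prefers the formula to the infinitesimal triangle, the computation can be spelled out as follows; this is only a worked version of the step just described, using the mapping $i(q, t)$ of the example.

```latex
\begin{align*}
\frac{\partial i}{\partial q} &= (r\cos q\cos\omega t,\; r\cos q\sin\omega t,\; -r\sin q),
  & \Big|\frac{\partial i}{\partial q}\Big|^2 &= r^2, \\
\frac{\partial i}{\partial t} &= (-r\omega\sin q\sin\omega t,\; r\omega\sin q\cos\omega t,\; 0),
  & \Big|\frac{\partial i}{\partial t}\Big|^2 &= \omega^2 r^2\sin^2 q, \\
\Big(\frac{\partial i}{\partial q}, \frac{\partial i}{\partial t}\Big) &= 0,
  & T = \frac{m}{2}\Big|\frac{\partial i}{\partial q}\,\dot q + \frac{\partial i}{\partial t}\Big|^2
    &= \frac{m}{2}\big(r^2\dot q^2 + \omega^2 r^2\sin^2 q\big).
\end{align*}
```

The potential energy is $U = mgz = mgr\cos q$, since $z = r\cos q$ is the height of the bead above O.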


In this case the lagrangian function $L = T - U$ turns out to be independent of $t$, although the
constraint does depend on time. Furthermore, the lagrangian function turns out to be the same
as in the one-dimensional system with kinetic energy

$$T_0 = \frac{M}{2}\dot q^2, \qquad M = mr^2,$$

and with potential energy

$$V = A \cos q - B \sin^2 q, \qquad A = mgr, \quad B = \frac{m}{2}\omega^2 r^2.$$

The form of the phase portrait depends on the ratio between $A$ and $B$. For $2B < A$ (i.e., for a
rotation of the circle slow enough that $\omega^2 r < g$), the lowest position of the bead ($q = \pi$) is
stable and the characteristics of the motion are generally the same as in the case of a mathematical
pendulum ($\omega = 0$).


Figure 69 Effective potential energy and phase plane of the bead


For $2B > A$, i.e., for sufficiently fast rotation of the circle, the lowest position of the bead
becomes unstable; on the other hand, two stable positions of the bead appear on the circle,
where $\cos q = -A/2B = -g/\omega^2 r$. The behavior of the bead under all possible initial conditions
is clear from the shape of the phase curves in the $(q, \dot q)$-plane (Figure 69).
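
The change in the phase portrait at $2B = A$ can be reproduced by locating the critical points of $V$ and testing the sign of $V''$. The sketch below is an illustration only; the function `equilibria` and its return format are hypothetical. It uses $V'(q) = -\sin q\,(A + 2B\cos q)$ and $V''(q) = -A\cos q - 2B\cos 2q$.

```python
import math


def equilibria(m, g, r, omega):
    """Equilibria q in [0, 2*pi) of V = A cos q - B sin^2 q, each paired with a stability flag.

    An equilibrium is reported as stable when V''(q) > 0.
    """
    A = m * g * r
    B = 0.5 * m * omega ** 2 * r ** 2

    def d2V(q):
        return -A * math.cos(q) - 2.0 * B * math.cos(2.0 * q)

    points = [0.0, math.pi]          # sin q = 0: highest and lowest positions
    if 2.0 * B > A:                  # fast rotation: two extra equilibria appear
        q_star = math.acos(-A / (2.0 * B))
        points += [q_star, 2.0 * math.pi - q_star]
    return [(q, d2V(q) > 0) for q in points]
```

For slow rotation only the lowest position $q = \pi$ is reported as stable; for fast rotation it loses stability and the two positions with $\cos q = -g/\omega^2 r$ are stable instead.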


20 E. Noether’s theorem 


Various laws of conservation (of momentum, angular momentum, etc.) are particular cases of 
one general theorem: to every one-parameter group of diffeomorphisms of the configuration 
manifold of a lagrangian system which preserves the lagrangian function, there corresponds a 
first integral of the equations of motion. 


A Formulation of the theorem 


Let $M$ be a smooth manifold, $L: TM \to \mathbb{R}$ a smooth function on its tangent
bundle $TM$. Let $h: M \to M$ be a smooth map.


Definition. A lagrangian system $(M, L)$ admits the mapping $h$ if for any tangent
vector $\mathbf{v} \in TM$,

$$L(h_*\mathbf{v}) = L(\mathbf{v}).$$


EXAMPLE. Let $M = \{(x_1, x_2, x_3)\}$, $L = (m/2)(\dot x_1^2 + \dot x_2^2 + \dot x_3^2) - U(x_2, x_3)$. The system admits
the translation $h: (x_1, x_2, x_3) \to (x_1 + s, x_2, x_3)$ along the $x_1$ axis and does not admit, generally
speaking, translations along the $x_2$ axis.


Noether’s theorem. If the system $(M, L)$ admits the one-parameter group of
diffeomorphisms $h^s: M \to M$, $s \in \mathbb{R}$, then the lagrangian system of equations
corresponding to $L$ has a first integral $I: TM \to \mathbb{R}$.

In local coordinates $\mathbf{q}$ on $M$ the integral $I$ is written in the form

$$I(\mathbf{q}, \dot{\mathbf{q}}) = \frac{\partial L}{\partial \dot{\mathbf{q}}}\, \frac{dh^s(\mathbf{q})}{ds}\bigg|_{s=0}.$$


B Proof 


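First, let $M = \mathbb{R}^n$ be coordinate space. Let $\varphi: \mathbb{R} \to M$, $\mathbf{q} = \varphi(t)$ be a solution
to Lagrange’s equations. Since $h^s_*$ preserves $L$, the translation of a solution,
$h^s \circ \varphi: \mathbb{R} \to M$ also satisfies Lagrange’s equations for any $s$.³⁸

We consider the mapping $\Phi: \mathbb{R} \times \mathbb{R} \to \mathbb{R}^n$, given by $\mathbf{q} = \Phi(s, t) = h^s(\varphi(t))$
(Figure 70).

We will denote derivatives with respect to $t$ by dots and with respect to $s$
by primes. By hypothesis

$$(1) \qquad 0 = \frac{\partial L(\Phi, \dot\Phi)}{\partial s} = \frac{\partial L}{\partial \mathbf{q}}\cdot \Phi' + \frac{\partial L}{\partial \dot{\mathbf{q}}}\cdot \dot\Phi',$$

where the partial derivatives of $L$ are taken at the point $\mathbf{q} = \Phi(s, t)$, $\dot{\mathbf{q}} = \dot\Phi(s, t)$.


³⁸ The authors of several textbooks mistakenly assert that the converse is also true, i.e., that if
$h^s$ takes solutions to solutions, then $h^s_*$ preserves $L$.


Figure 70 Noether’s theorem


As we stated above, the mapping $\Phi|_{s = \text{const}}: \mathbb{R} \to \mathbb{R}^n$ for any fixed $s$
satisfies Lagrange’s equation

$$\frac{\partial}{\partial t}\left[\frac{\partial L}{\partial \dot{\mathbf{q}}}(\Phi(s, t), \dot\Phi(s, t))\right] = \frac{\partial L}{\partial \mathbf{q}}(\Phi(s, t), \dot\Phi(s, t)).$$

We introduce the notation $\mathbf{F}(s, t) = (\partial L/\partial \dot{\mathbf{q}})(\Phi(s, t), \dot\Phi(s, t))$ and substitute
$\partial \mathbf{F}/\partial t$ for $\partial L/\partial \mathbf{q}$ in (1).
Writing $\dot{\mathbf{q}}'$ as $d\mathbf{q}'/dt$, we get

$$0 = \left(\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}}\right)\mathbf{q}' + \frac{\partial L}{\partial \dot{\mathbf{q}}}\left(\frac{d}{dt}\mathbf{q}'\right) = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\mathbf{q}}}\,\mathbf{q}'\right) = \frac{dI}{dt}.$$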


Remark. The first integral $I = (\partial L/\partial \dot{\mathbf{q}})\mathbf{q}'$ is defined above using local
coordinates $\mathbf{q}$. It turns out that the value of $I(\mathbf{v})$ does not depend on the choice
of coordinate system $\mathbf{q}$.

In fact, $I$ is the rate of change of $L(\mathbf{v})$ when the vector $\mathbf{v} \in TM_{\mathbf{x}}$ varies inside
$TM_{\mathbf{x}}$ with velocity $(d/ds)|_{s=0}\, h^s\mathbf{x}$. Therefore, $I(\mathbf{v})$ is well defined as a function
of the tangent vector $\mathbf{v} \in TM_{\mathbf{x}}$. Noether’s theorem is proved in the same way
when $M$ is a manifold.


C Examples 
EXAMPLE 1. Consider a system of point masses with masses $m_i$:

$$L = \sum \frac{m_i \dot{\mathbf{x}}_i^2}{2} - U(\mathbf{x}), \qquad \mathbf{x}_i = x_{i1}\mathbf{e}_1 + x_{i2}\mathbf{e}_2 + x_{i3}\mathbf{e}_3,$$

constrained by the conditions $f_j(\mathbf{x}) = 0$. We assume that the system admits
translations along the $\mathbf{e}_1$ axis:

$$h^s: \mathbf{x}_i \to \mathbf{x}_i + s\mathbf{e}_1 \quad \text{for all } i.$$

In other words, the constraints admit motions of the system as a whole
along the $\mathbf{e}_1$ axis, and the potential energy does not change under these.

By Noether’s theorem we conclude: If a system admits translations along
the $\mathbf{e}_1$ axis, then the projection of its center of mass on the $\mathbf{e}_1$ axis moves
linearly and uniformly.

In fact, $(d/ds)|_{s=0}\, h^s\mathbf{x}_i = \mathbf{e}_1$. According to the remark at the end of B, the
quantity

$$I = \sum \frac{\partial L}{\partial \dot{\mathbf{x}}_i}\,\mathbf{e}_1 = \sum m_i \dot x_{i1}$$

is preserved, i.e., the first component $P_1$ of the momentum vector is preserved.
We showed this earlier for a system without constraints.


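EXAMPLE 2. If a system admits rotations around the $\mathbf{e}_1$ axis, then the angular
momentum with respect to this axis,

$$M_1 = \sum_i ([\mathbf{x}_i, m_i\dot{\mathbf{x}}_i], \mathbf{e}_1),$$

is conserved.
It is easy to verify that if $h^s$ is rotation around the $\mathbf{e}_1$ axis by the angle $s$,
then $(d/ds)|_{s=0}\, h^s\mathbf{x}_i = [\mathbf{e}_1, \mathbf{x}_i]$, from which it follows that

$$I = \sum_i \left(\frac{\partial L}{\partial \dot{\mathbf{x}}_i}, [\mathbf{e}_1, \mathbf{x}_i]\right) = \sum_i (m_i\dot{\mathbf{x}}_i, [\mathbf{e}_1, \mathbf{x}_i]) = \sum_i ([\mathbf{x}_i, m_i\dot{\mathbf{x}}_i], \mathbf{e}_1).$$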


PROBLEM 1. Suppose that a particle moves in the field of the uniform helical line $x = \cos\varphi$,
$y = \sin\varphi$, $z = c\varphi$. Find the law of conservation corresponding to this helical symmetry.


ANSWER. In any system which admits helical motions leaving our helical line fixed, the quantity
$I = cP_3 + M_3$ is conserved.
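
The answer can be checked on a concrete system. The sketch below assumes a particle of unit mass in the potential $U = \tfrac{1}{2}(x^2 + y^2) + x\cos(z/c) + y\sin(z/c)$, a hypothetical choice that is invariant under the helical motions (rotation by $s$ about the $z$ axis combined with translation by $cs$ along it); it integrates Newton’s equations with a Runge-Kutta step and measures the drift of $I = cP_3 + M_3$.

```python
import math

C = 0.7  # pitch parameter c of the helix (assumed value)


def force(x, y, z):
    """Minus the gradient of U = (x^2 + y^2)/2 + x cos(z/c) + y sin(z/c)."""
    cz, sz = math.cos(z / C), math.sin(z / C)
    return (-(x + cz), -(y + sz), -(-x * sz + y * cz) / C)


def rhs(s):
    x, y, z, px, py, pz = s
    fx, fy, fz = force(x, y, z)
    return (px, py, pz, fx, fy, fz)  # unit mass: velocity equals momentum


def rk4_step(s, h):
    def shift(a, k, c):
        return tuple(u + c * w for u, w in zip(a, k))
    k1 = rhs(s)
    k2 = rhs(shift(s, k1, h / 2))
    k3 = rhs(shift(s, k2, h / 2))
    k4 = rhs(shift(s, k3, h))
    return tuple(u + h / 6 * (a + 2 * b + 2 * c + d)
                 for u, a, b, c, d in zip(s, k1, k2, k3, k4))


def helical_integral(s):
    """I = c P_3 + M_3, where M_3 = x p_y - y p_x."""
    x, y, z, px, py, pz = s
    return C * pz + (x * py - y * px)


def drift(s, h=1e-3, steps=2000):
    """Largest deviation of I from its initial value along the computed motion."""
    i0 = helical_integral(s)
    worst = 0.0
    for _ in range(steps):
        s = rk4_step(s, h)
        worst = max(worst, abs(helical_integral(s) - i0))
    return worst
```

Neither $P_3$ nor $M_3$ is conserved separately in this field, but the combination $cP_3 + M_3$ stays constant up to the integration error.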


PROBLEM 2. Suppose that a rigid body is moving under its own inertia. Show that its center of
mass moves linearly and uniformly. If the center of mass is at rest, then the angular momentum
with respect to it is conserved.


PROBLEM 3. What quantity is conserved under the motion of a heavy rigid body if it is fixed at
some point O? What if, in addition, the body is symmetric with respect to an axis passing
through O?


PROBLEM 4. Extend Noether’s theorem to non-autonomous lagrangian systems.

Hint. Let $M_1 = M \times \mathbb{R}$ be the extended configuration space (the direct product of the
configuration manifold $M$ with the time axis $\mathbb{R}$).
Define a function $L_1: TM_1 \to \mathbb{R}$ by

$$L\,\frac{dt}{d\tau},$$

i.e., in local coordinates $\mathbf{q}, t$ on $M_1$ we define it by the formula

$$L_1\left(\mathbf{q}, t, \frac{d\mathbf{q}}{d\tau}, \frac{dt}{d\tau}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}/d\tau}{dt/d\tau}, t\right)\frac{dt}{d\tau}.$$

We apply Noether’s theorem to the lagrangian system $(M_1, L_1)$.

If $L_1$ admits the transformations $h^s: M_1 \to M_1$, we obtain a first integral $I_1: TM_1 \to \mathbb{R}$.
Since $\int L\, dt = \int L_1\, d\tau$, this reduces to a first integral $I: TM \times \mathbb{R} \to \mathbb{R}$ of the original system.
If, in local coordinates $(\mathbf{q}, t)$ on $M_1$, we have $I_1 = I_1(\mathbf{q}, t, d\mathbf{q}/d\tau, dt/d\tau)$, then $I(\mathbf{q}, \dot{\mathbf{q}}, t) = I_1(\mathbf{q}, t, \dot{\mathbf{q}}, 1)$.

In particular, if $L$ does not depend on time, $L_1$ admits translations along time, $h^s(\mathbf{q}, t) = (\mathbf{q}, t + s)$.
The corresponding first integral $I$ is the energy integral.


21 D’Alembert’s principle 


We give here a new definition of a system of point masses with holonomic constraints and prove 
its equivalence to the definition given in Section 17. 


A Example 


Consider the holonomic system $(M, L)$, where $M$ is a surface in three-dimensional
space $\{\mathbf{x}\}$:

$$L = \tfrac{1}{2}m\dot{\mathbf{x}}^2 - U(\mathbf{x}).$$

In mechanical terms, “the mass point $\mathbf{x}$ of mass $m$ must remain on the smooth
surface $M$.”

Consider a motion of the point, $\mathbf{x}(t)$. If Newton’s equations $m\ddot{\mathbf{x}} + (\partial U/\partial \mathbf{x}) = 0$
were satisfied, then in the absence of external forces ($U = 0$) the trajectory
would be a straight line and could not lie on the surface $M$.

From the point of view of Newton, this indicates the presence of a new 
force “forcing the point to stay on the surface.” 

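Definition. The quantity

$$\mathbf{R} = m\ddot{\mathbf{x}} + \frac{\partial U}{\partial \mathbf{x}}$$

is called the constraint force (Figure 71).


Figure 71 Constraint force


If we take the constraint force $\mathbf{R}(t)$ into account, Newton’s equations are
obviously satisfied:

$$m\ddot{\mathbf{x}} = -\frac{\partial U}{\partial \mathbf{x}} + \mathbf{R}.$$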


The physical meaning of the constraint force becomes clear if we consider our system with
constraints as the limit of systems with potential energy $U + NU_1$ as $N \to \infty$, where
$U_1(\mathbf{x}) = \rho^2(\mathbf{x}, M)$. For large $N$ the constraint potential $NU_1$ produces a rapidly changing force
$\mathbf{F} = -N\, \partial U_1/\partial \mathbf{x}$; when we pass to the limit ($N \to \infty$) the average value of the force $\mathbf{F}$ under
oscillations of $\mathbf{x}$ near $M$ is $\mathbf{R}$. The force $\mathbf{F}$ is perpendicular to $M$. Therefore, the constraint
force $\mathbf{R}$ is perpendicular to $M$: $(\mathbf{R}, \xi) = 0$ for every tangent vector $\xi$.


B Formulation of the D’Alembert-Lagrange
principle


In mechanics, tangent vectors to the configuration manifold are called
virtual variations. The D’Alembert-Lagrange principle states:

$$\left(m\ddot{\mathbf{x}} + \frac{\partial U}{\partial \mathbf{x}}, \xi\right) = 0$$

for any virtual variation $\xi$, or stated differently, the work of the constraint force
on any virtual variation is zero.

For a system of points $\mathbf{x}_i$ with masses $m_i$ the constraint forces $\mathbf{R}_i$ are defined
by $\mathbf{R}_i = m_i\ddot{\mathbf{x}}_i + (\partial U/\partial \mathbf{x}_i)$, and D’Alembert’s principle has the form $\sum (\mathbf{R}_i, \xi_i) = 0$,
or $\sum ((m_i\ddot{\mathbf{x}}_i + (\partial U/\partial \mathbf{x}_i)), \xi_i) = 0$, i.e., the sum of the works of the constraint
forces on any virtual variation $\{\xi_i\} \in TM_{\mathbf{x}}$ is zero.

Constraints with the property described above are called ideal. 


If we define a system with holonomic constraints as a limit as $N \to \infty$, then the D’Alembert-Lagrange
principle becomes a theorem: its proof is sketched above for the simplest case.

It is possible, however, to define an ideal holonomic constraint using the D’Alembert-Lagrange
principle. In this way we have three definitions of holonomic systems with constraints:


1. The limit of systems with potential energies $U + NU_1$ as $N \to \infty$.

2. A holonomic system $(M, L)$, where $M$ is a smooth submanifold of the configuration space
of a system without constraints and $L$ is the lagrangian.

3. A system which complies with the D’Alembert-Lagrange principle.


All three definitions are mathematically equivalent.
The proof of the implications (1) ⇒ (2) and (1) ⇒ (3) is sketched above and will not be given
in further detail. We will now show that (2) ⇔ (3).


C The equivalence of the D’Alembert-Lagrange
principle and the variational principle


Let $M$ be a submanifold of euclidean space, $M \subset \mathbb{R}^N$, and $\mathbf{x}: \mathbb{R} \to M$ a curve,
with $\mathbf{x}(t_0) = \mathbf{x}_0$, $\mathbf{x}(t_1) = \mathbf{x}_1$.


Definition. The curve $\mathbf{x}$ is called a conditional extremal of the action functional

$$\Phi = \int_{t_0}^{t_1} \left\{\frac{\dot{\mathbf{x}}^2}{2} - U(\mathbf{x})\right\} dt,$$

if the differential $\delta\Phi$ is equal to zero under the condition that the variation
consists of nearby curves³⁹ joining $\mathbf{x}_0$ to $\mathbf{x}_1$ in $M$.


³⁹ Strictly speaking, in order to define a variation $\delta\Phi$, one must define on the set of curves near $\mathbf{x}$
on $M$ the structure of a region in a vector space. This can be done using coordinates on $M$;
however, the property of being a conditional extremal does not depend on the choice of a coordinate
system.


We will write

$$(1) \qquad \delta_M\Phi = 0.$$

Clearly, Equation (1) is equivalent to the Lagrange equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}} = \frac{\partial L}{\partial \mathbf{q}}, \qquad L = \frac{\dot{\mathbf{x}}^2}{2} - U(\mathbf{x}), \qquad \mathbf{x} = \mathbf{x}(\mathbf{q}),$$

in some local coordinate system $\mathbf{q}$ on $M$.


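Theorem. A curve $\mathbf{x}: \mathbb{R} \to M \subset \mathbb{R}^N$ is a conditional extremal of the action
(i.e., satisfies Equation (1)) if and only if it satisfies D’Alembert’s equation

$$(2) \qquad \left(\ddot{\mathbf{x}} + \frac{\partial U}{\partial \mathbf{x}}, \xi\right) = 0, \qquad \forall \xi \in TM_{\mathbf{x}}.$$


Lemma. Let $\mathbf{f}: \{t: t_0 \le t \le t_1\} \to \mathbb{R}^N$ be a continuous vector field. If, for every
continuous tangent vector field $\xi$, tangent to $M$ along $\mathbf{x}$ (i.e., $\xi(t) \in TM_{\mathbf{x}(t)}$,
with $\xi(t) = 0$ for $t = t_0, t_1$), we have

$$\int_{t_0}^{t_1} \mathbf{f}(t)\,\xi(t)\, dt = 0,$$

then the field $\mathbf{f}(t)$ is perpendicular to $M$ at every point $\mathbf{x}(t)$ (i.e., $(\mathbf{f}(t), \mathbf{h}) = 0$
for every vector $\mathbf{h} \in TM_{\mathbf{x}(t)}$) (Figure 72).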


Figure 72 Lemma about the normal field 


The proof of the lemma repeats the argument which we used to derive the 
Euler-Lagrange equations in Section 12. 


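PROOF OF THE THEOREM. We compare the value of $\Phi$ on the two curves $\mathbf{x}(t)$
and $\mathbf{x}(t) + \xi(t)$, where $\xi(t_0) = \xi(t_1) = 0$. Integrating by parts, we obtain

$$\delta\Phi = \int_{t_0}^{t_1} \left(\dot{\mathbf{x}}\dot\xi - \frac{\partial U}{\partial \mathbf{x}}\xi\right) dt = -\int_{t_0}^{t_1} \left(\ddot{\mathbf{x}} + \frac{\partial U}{\partial \mathbf{x}}\right)\xi\, dt.$$

It is obvious from this formula⁴⁰ that Equation (1), $\delta_M\Phi = 0$, is equivalent
to the collection of equations

$$(3) \qquad \int_{t_0}^{t_1} \left(\ddot{\mathbf{x}} + \frac{\partial U}{\partial \mathbf{x}}\right)\xi\, dt = 0$$

for all tangent vector fields $\xi(t) \in TM_{\mathbf{x}(t)}$ with $\xi(t_0) = \xi(t_1) = 0$. By the
lemma (where we must set $\mathbf{f} = \ddot{\mathbf{x}} + (\partial U/\partial \mathbf{x})$) the collection of equations (3)
is equivalent to the D’Alembert-Lagrange equation (2). □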


D Remarks 


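Remark 1. We derive the D’Alembert-Lagrange principle for a system of n
points $\mathbf{x}_i \in \mathbb{R}^3$, $i = 1, \ldots, n$, with masses $m_i$, with holonomic constraints,
from the above theorem.

In the coordinates $\bar{\mathbf{x}} = \{\bar{\mathbf{x}}_i = \sqrt{m_i}\,\mathbf{x}_i\}$, the kinetic energy takes the form
$T = \tfrac{1}{2}\sum m_i\dot{\mathbf{x}}_i^2 = \tfrac{1}{2}\dot{\bar{\mathbf{x}}}^2$.

By the theorem, the extremals of the principle of least action satisfy the
condition

$$\left(\ddot{\bar{\mathbf{x}}} + \frac{\partial U}{\partial \bar{\mathbf{x}}}, \bar\xi\right) = 0$$

(the D’Alembert-Lagrange principle for points in $\mathbb{R}^{3n}$: the 3n-dimensional
reaction force is orthogonal to the manifold $M$ in the metric $T$). Returning
to the coordinates $\mathbf{x}_i$, with $\bar\xi_i = \sqrt{m_i}\,\xi_i$, we get

$$0 = \sum_i \left(\sqrt{m_i}\,\ddot{\mathbf{x}}_i + \frac{1}{\sqrt{m_i}}\frac{\partial U}{\partial \mathbf{x}_i},\; \sqrt{m_i}\,\xi_i\right) = \sum_i \left(m_i\ddot{\mathbf{x}}_i + \frac{\partial U}{\partial \mathbf{x}_i},\; \xi_i\right),$$

i.e., the D’Alembert-Lagrange principle in the form indicated earlier: the
sum of the work of the reaction forces on virtual variations is zero.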

Remark 2. The D’Alembert-Lagrange principle can be given in a slightly
different form if we turn to statics. An equilibrium position is a point $\mathbf{x}_0$ which
is the orbit of a motion: $\mathbf{x}(t) = \mathbf{x}_0$.

Suppose that a point mass moves along a smooth surface $M$ under the
influence of the force $\mathbf{f} = -\partial U/\partial \mathbf{x}$.


Theorem. The point $\mathbf{x}_0$ in $M$ is an equilibrium position if and only if the force
is orthogonal to the surface at $\mathbf{x}_0$: $(\mathbf{f}(\mathbf{x}_0), \xi) = 0$ for all $\xi \in TM_{\mathbf{x}_0}$.


This follows from the D’Alembert-Lagrange equations in view of the
fact that $\ddot{\mathbf{x}} = 0$.


Definition. $-m\ddot{\mathbf{x}}$ is called the force of inertia.


⁴⁰ The distance of the points $\mathbf{x}(t) + \xi(t)$ from $M$ is small of second-order compared with $\xi(t)$.


Now the D’Alembert-Lagrange principle takes the form: 


Theorem. If the forces of inertia are added to the acting forces, x becomes an 
equilibrium position. 


PROOF. D'Alembert's equation

$$(-m\ddot{x} + f,\ \xi) = 0$$

expresses the fact, as in the preceding theorem, that $x$ is an equilibrium
position of a system with forces $-m\ddot{x} + f$. □


Entirely analogous statements are true for systems of points: If $x = \{x_i\}$
are equilibrium positions, then the sum of the work of the forces acting on the
virtual variations is equal to zero. If the forces of inertia $-m_i\ddot{x}_i(t)$ are added
to the acting forces, then the position $x(t)$ becomes an equilibrium position.

Now a problem about motions can be reduced to a problem about 
equilibrium under actions of other forces. 

Remark 3. Up to now we have not considered cases when the constraints 
depend on time. All that was said above carries over to such constraints 
without any changes. 


EXAMPLE. Consider a bead sliding along a rod which is tilted at an angle $\alpha$
to the vertical axis and is rotating uniformly with angular velocity $\omega$ around
this axis (its weight is negligible). For our coordinate $q$ we take the distance
from the point $O$ (Figure 73). The kinetic energy and lagrangian are:

$$L = T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}m\dot{q}^2 + \tfrac{1}{2}m\omega^2 r^2, \qquad r = q\sin\alpha.$$

Lagrange's equation: $m\ddot{q} = m\omega^2 q \sin^2\alpha$.


Figure 73 Bead on a rotating rod


The constraint force at each moment is orthogonal to virtual variations
(i.e., to the direction of the rod), but is not at all orthogonal to the actual
trajectory.
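
As a numerical illustration, the bead's equation of motion can be written as q'' = k^2 q with k = ω sin α, whose solutions are combinations of cosh kt and sinh kt, so the bead generally runs away along the rod. The sketch below integrates the equation with a classical Runge-Kutta step and compares the result with this closed form. The function names, step count, and parameter values are illustrative assumptions, not part of the text.

```python
import math

def bead_exact(q0, v0, omega, alpha, t):
    """Closed-form solution of q'' = k^2 q with k = omega*sin(alpha)."""
    k = omega * math.sin(alpha)
    return q0 * math.cosh(k * t) + (v0 / k) * math.sinh(k * t)

def bead_rk4(q0, v0, omega, alpha, t_end, steps=2000):
    """Integrate q' = v, v' = k^2 q from t = 0 to t_end with classical RK4."""
    k2 = (omega * math.sin(alpha)) ** 2
    h = t_end / steps
    q, v = q0, v0
    for _ in range(steps):
        a1, b1 = v, k2 * q
        a2, b2 = v + 0.5 * h * b1, k2 * (q + 0.5 * h * a1)
        a3, b3 = v + 0.5 * h * b2, k2 * (q + 0.5 * h * a2)
        a4, b4 = v + h * b3, k2 * (q + h * a3)
        q += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6
        v += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6
    return q

if __name__ == "__main__":
    # hypothetical values: omega = 2, alpha = 30 degrees, bead released at q = 1
    print(bead_rk4(1.0, 0.0, 2.0, math.pi / 6, 1.0))
    print(bead_exact(1.0, 0.0, 2.0, math.pi / 6, 1.0))
```

The mass m cancels from Lagrange's equation, which is why it does not appear as a parameter.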

Remark 4. It is easy to derive conservation laws from the D'Alembert-
Lagrange equations. For example, if translation along the $x_1$ axis, $\xi_i = e_1$, is
among the virtual variations, then the sum of the work of the constraint forces
on this variation is equal to zero:

$$\sum_i (R_i, e_1) = \Big(\sum_i R_i,\ e_1\Big) = 0.$$

If we now consider constraint forces as external forces, then we notice that the
sum of the first components of the external forces is equal to zero. This means
that the first component, $P_1$, of the momentum vector is preserved.

We obtained this same result earlier from Noether’s theorem. 


Remark 5. We emphasize once again that the holonomic character of some 
particular physical constraint or another (to a given degree of exactness) is a 
question of experiment. From the mathematical point of view, the holonomic 
character of a constraint is a postulate of physical origin; it can be introduced 
in various equivalent forms, for example, in the form of the principle of least 
action (1) or the D’Alembert—Lagrange principle (2), but, when defining 
the constraints, the term always refers to experimental facts which go beyond 
Newton’s equations. 


Remark 6. Our terminology differs somewhat from that used in mechanics 
textbooks, where the D’Alembert-Lagrange principle is extended to a wider 
class of systems (“non-holonomic systems with ideal constraints”). In this 
book we will not consider non-holonomic systems. We remark only that one 
example of a non-holonomic system is a sphere rolling on a plane without 
slipping. In the tangent space at each point of the configuration manifold of a 
non-holonomic system there is a fixed subspace to which the velocity vector 
must belong. 


Remark 7. If a system consists of mass points connected by rods, hinges, 
etc., then the need may arise to talk about the constraint force of some partic- 
ular constraint. 

We defined the total "constraint force of all constraints" $R_i$ for every mass
point $m_i$. The concept of a constraint force for an individual constraint is
impossible to define, as may be already seen from the simple example of a beam
resting on three columns. If we try to define constraint forces of the columns,
$R_1$, $R_2$, $R_3$, by passing to a limit (considering the columns as very rigid
springs), then we may become convinced that the result depends on the
distribution of rigidity.


Figure 74 Constraint force on a rod


Problems for students are selected so that this difficulty does not arise. 


PROBLEM. A rod of weight P, tilted at an angle of 60° to the plane of a table, begins to fall 
with initial velocity zero (Figure 74). Find the constraint force of the table at the initial moment, 
considering the table as (a) absolutely smooth and (b) absolutely rough. (In the first case, the 
holonomic constraint holds the end of the rod on the plane of the table, and in the second case, 
at a given point.)


Because linear equations are easy to solve and study, the theory of linear
oscillations is the most highly developed area of mechanics. In many non- 
linear problems, linearization produces a satisfactory approximate solution. 
Even when this is not the case, the study of the linear part of a problem is 
often a first step, to be followed by the study of the relation between motions 
in a nonlinear system and in its linear model. 


22 Linearization 
We give here the definition of small oscillations. 

A Equilibrium positions


Definition. A point $x_0$ is called an equilibrium position of the system

(1)    $\dfrac{dx}{dt} = f(x), \qquad x \in \mathbb{R}^n$

if $x(t) \equiv x_0$ is a solution of this system. In other words, $f(x_0) = 0$, i.e.,
the vector field $f(x)$ is zero at $x_0$.


EXAMPLE. Consider the natural dynamical system with lagrangian function
$L(q, \dot{q}) = T - U$, where $T = \frac{1}{2}\sum a_{ij}(q)\dot{q}_i\dot{q}_j \ge 0$ and $U = U(q)$:

(2)    $\dfrac{d}{dt}\dfrac{\partial L}{\partial \dot{q}} = \dfrac{\partial L}{\partial q}, \qquad q = (q_1, \ldots, q_n).$

Lagrange's equations can be written in the form of a system of $2n$ first-
order equations of form (1). We will try to find an equilibrium position:


Theorem. The point $q = q_0$, $\dot{q} = \dot{q}_0$ will be an equilibrium position if and only
if $\dot{q}_0 = 0$ and $q_0$ is a critical point of the potential energy, i.e.,

(3)    $\left.\dfrac{\partial U}{\partial q}\right|_{q_0} = 0.$


PROOF. We write down Lagrange's equations

$$\frac{d}{dt}\frac{\partial T}{\partial \dot{q}} = \frac{\partial T}{\partial q} - \frac{\partial U}{\partial q}.$$

From (2) it is clear that, for $\dot{q} = 0$, we will have $\partial T/\partial q = 0$ and $\partial T/\partial \dot{q} = 0$.
Therefore, $q = q_0$ is a solution in case (3) holds and only in that case. □

B Stability of equilibrium positions 


We will now investigate motions with initial conditions close to an equi- 
librium position. 


Theorem. If the point qo is a strict local minimum of the potential energy U, 
then the equilibrium q = qo is stable in the sense of Liapunov. 


PROOF. Let $U(q_0) = h$. For sufficiently small $\varepsilon > 0$, the connected com-
ponent of the set $\{q : U(q) \le h + \varepsilon\}$ containing $q_0$ will be an arbitrarily
small neighborhood of $q_0$ (Figure 75). Furthermore, the connected com-
ponent of the corresponding region in phase space $p, q$, $\{p, q : E(p, q) \le h + \varepsilon\}$
(where $p = \partial T/\partial \dot{q}$ is the momentum and $E = T + U$ is the total
energy), will be an arbitrarily small neighborhood of the point $p = 0$, $q = q_0$.
But the region $\{p, q : E \le h + \varepsilon\}$ is invariant with respect to the phase
flow by the law of conservation of energy. Therefore, for initial conditions
$p(0), q(0)$ close enough to $(0, q_0)$, every phase trajectory $(p(t), q(t))$ is close to
$(0, q_0)$. □


Figure 75 Stable equilibrium position


PROBLEM. Can an equilibrium position $q = q_0$, $p = 0$ be asymptotically stable?


PROBLEM. Show that in an analytic system with one degree of freedom an equilibrium position
$q_0$ which is not a strict local minimum of the potential energy is not stable in the sense of
Liapunov. Produce an example of an infinitely differentiable system where this is not true.


Remark. It seems likely that in an analytic system with n degrees of 
freedom, an equilibrium position which is not a minimum point is unstable; 
but this has never been proved for n > 2. 


C Linearization of a differential equation 


We now turn to the general system (1). In studying solutions of (1) which are
close to an equilibrium position $x_0$, we often use a linearization. Assume that
$x_0 = 0$ (the general case is reduced to this one by a translation of the co-
ordinate system). Then the first term of the Taylor series for $f$ is linear:

$$f(x) = Ax + R_2(x), \qquad A = \left.\frac{\partial f}{\partial x}\right|_0 \quad\text{and}\quad R_2 = O(x^2),$$

where the linear operator $A$ is given in coordinates $x_1, \ldots, x_n$ by the matrix
$a_{ij}$:

$$(Ax)_i = \sum_j a_{ij}x_j; \qquad a_{ij} = \frac{\partial f_i}{\partial x_j}.$$


Definition. The passage from system (1) to the system

(4)    $\dfrac{dy}{dt} = Ay \qquad (x \in \mathbb{R}^n,\ y \in T\mathbb{R}^n_0)$

is called the linearization of (1).


PROBLEM. Show that linearization is a well-defined operation: the operator
$A$ does not depend on the coordinate system.

The advantage of the linearized system is that it is linear and therefore
easily solved:

$$y(t) = e^{At}y(0), \quad\text{where}\quad e^{At} = E + At + \frac{A^2t^2}{2!} + \cdots.$$

Knowing the solution of the linearized system (4), we can say something
about solutions of the original system (1). For small enough $x$, the difference
between the linearized and original systems, $R_2(x)$, is small in comparison
with $x$. Therefore, for a long time, the solutions $y(t)$, $x(t)$ of both systems
with initial conditions $y(0) = x(0) = x_0$ remain close. More explicitly, we
can easily prove the following:


Theorem. For any $T > 0$ and for any $\varepsilon > 0$ there is a $\delta > 0$ such that if
$|x(0)| < \delta$, then $|x(t) - y(t)| < \varepsilon\delta$ for all $t$ in the interval $0 < t < T$.
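
The estimate in this theorem can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the text: it takes the pendulum x'' = -sin x as the nonlinear system, y'' = -y as its linearization at 0, integrates both from the same initial point (δ, 0) in the phase plane, and returns the largest gap on [0, T] divided by δ. The helper names, the choice of the pendulum, and the step count are all hypothetical.

```python
import math

def rk4(f, state, t_end, steps):
    """Classical RK4 for an autonomous system s' = f(s); returns the whole trajectory."""
    h = t_end / steps
    s = list(state)
    traj = [tuple(s)]
    n = len(s)
    for _ in range(steps):
        k1 = f(s)
        k2 = f([s[i] + 0.5 * h * k1[i] for i in range(n)])
        k3 = f([s[i] + 0.5 * h * k2[i] for i in range(n)])
        k4 = f([s[i] + h * k3[i] for i in range(n)])
        s = [s[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(n)]
        traj.append(tuple(s))
    return traj

def pendulum(s):
    # nonlinear system: x' = v, v' = -sin x
    return [s[1], -math.sin(s[0])]

def linearized(s):
    # its linearization at the equilibrium 0: y' = w, w' = -y
    return [s[1], -s[0]]

def relative_gap(delta, T=10.0, steps=4000):
    """max over [0, T] of |x(t) - y(t)| / delta, with x(0) = y(0) = (delta, 0)."""
    xs = rk4(pendulum, (delta, 0.0), T, steps)
    ys = rk4(linearized, (delta, 0.0), T, steps)
    gap = max(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in zip(xs, ys))
    return gap / delta

if __name__ == "__main__":
    for delta in (0.5, 0.1, 0.01):
        print(delta, relative_gap(delta))
```

For fixed T the printed ratio shrinks as δ decreases, which is the content of the inequality |x(t) - y(t)| < εδ.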
D Linearization of a lagrangian system


We return again to the lagrangian system (2) and try to linearize it in a
neighborhood of the equilibrium position $q = q_0$. In order to simplify the
formulas, we choose a coordinate system so that $q_0 = 0$.


Theorem. In order to linearize the lagrangian system (2) in a neighborhood of
the equilibrium position $q = 0$, it is sufficient to replace the kinetic energy
$T = \frac{1}{2}\sum a_{ij}(q)\dot{q}_i\dot{q}_j$ by its value at $q = 0$,

$$T_2 = \tfrac{1}{2}\sum a_{ij}\dot{q}_i\dot{q}_j, \qquad a_{ij} = a_{ij}(0),$$

and replace the potential energy $U(q)$ by its quadratic part

$$U_2 = \tfrac{1}{2}\sum b_{ij}q_iq_j, \qquad b_{ij} = \left.\frac{\partial^2 U}{\partial q_i\,\partial q_j}\right|_{q=0}.$$


PROOF. We reduce the lagrangian system to the form (1) by using the canonical
variables $p$ and $q$:

$$\dot{p} = -\frac{\partial H}{\partial q}, \qquad \dot{q} = \frac{\partial H}{\partial p}, \qquad H(p, q) = T + U.$$

Since $p = q = 0$ is an equilibrium position, the expansions of the right-hand
sides in Taylor series at zero begin with terms that are linear in $p$ and $q$.
Since the right-hand sides are partial derivatives, these linear terms are
determined by the quadratic terms $H_2$ of the expansion for $H(p, q)$. But
$H_2$ is precisely the hamiltonian function of the system with lagrangian
$L_2 = T_2 - U_2$, since, clearly, $H_2 = T_2(p) + U_2(q)$. Therefore, the linearized
equations of motion are the equations of motion for the system described
in the theorem with $L_2 = T_2 - U_2$. □


EXAMPLE. We consider the system with one degree of freedom:

$$T = \tfrac{1}{2}a(q)\dot{q}^2, \qquad U = U(q).$$

Let $q = q_0$ be a stable equilibrium position: $(\partial U/\partial q)|_{q=q_0} = 0$, $(\partial^2 U/\partial q^2)|_{q=q_0} > 0$
(Figure 76).


Figure 76 Linearization


As we know from the phase portrait, for initial conditions close to $q = q_0$,
$p = 0$, the solution is periodic with period $\tau$ depending, generally speaking,
on the initial conditions. The above two theorems imply


Corollary. The period $\tau$ of oscillations close to the equilibrium position $q_0$
approaches the limit $\tau_0 = 2\pi/\omega_0$ (where $\omega_0^2 = b/a$, $b = (\partial^2 U/\partial q^2)|_{q=q_0}$,
and $a = a(q_0)$) as the amplitudes of the oscillations decrease.


PROOF. For the linearized system, $T_2 = \frac{1}{2}a\dot{q}^2$ and $U_2 = \frac{1}{2}bq^2$ (taking $q_0 = 0$).
The solutions to Lagrange's equation $\ddot{q} = -\omega_0^2 q$ have period $\tau_0 = 2\pi/\omega_0$:

$$q = c_1\cos\omega_0 t + c_2\sin\omega_0 t$$

for any initial amplitude. □
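
The corollary can be checked on a case where the exact period is known. As an assumed example (not taken from the text), consider the pendulum with T = q'^2/2 and U = -cos q, for which a = b = 1 and τ_0 = 2π. Its exact period at amplitude θ is given by the standard formula τ = 2π / AGM(1, cos(θ/2)), where AGM denotes the arithmetic-geometric mean. The function names below are hypothetical.

```python
import math

def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of a and b (converges quadratically)."""
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def pendulum_period(amplitude):
    """Exact period of q'' = -sin q for oscillations of the given amplitude (radians)."""
    return 2 * math.pi / agm(1.0, math.cos(0.5 * amplitude))

if __name__ == "__main__":
    for amplitude in (1.0, 0.5, 0.1, 0.01):
        print(amplitude, pendulum_period(amplitude) - 2 * math.pi)
```

The printed differences decrease toward zero with the amplitude, in agreement with τ → τ_0.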


E Small oscillations 


Definition. Motions in a linearized system ($L_2 = T_2 - U_2$) are called small
oscillations⁴¹ near an equilibrium $q = q_0$. In a one-dimensional problem
the numbers $\tau_0$ and $\omega_0$ are called the period and the frequency of small
oscillations.


PROBLEM. Find the period of small oscillations of a bead of mass 1 on a wire $y = U(x)$ in a
gravitational field with $g = 1$, near an equilibrium position $x = x_0$ (Figure 77).


Figure 77 Bead on a wire


Solution. We have

$$U = mgy = U(x), \qquad T = \tfrac{1}{2}mv^2 = \tfrac{1}{2}\left[1 + \left(\frac{\partial U}{\partial x}\right)^2\right]\dot{x}^2.$$

Let $x_0$ be a stable equilibrium position: $(\partial U/\partial x)|_{x_0} = 0$; $(\partial^2 U/\partial x^2)|_{x_0} > 0$. Then the frequency
of small oscillations, $\omega$, is defined by the formula

$$\omega^2 = \left.\frac{\partial^2 U}{\partial x^2}\right|_{x_0},$$

since, for the linearized system, $T_2 = \frac{1}{2}\dot{q}^2$ and $U_2 = \frac{1}{2}\omega^2q^2$ ($q = x - x_0$).


⁴¹ If the equilibrium position is unstable, we will talk about "unstable small oscillations"
even though these motions may not have an oscillatory character.


PROBLEM. Show that not only a small oscillation, but any motion of the bead is equivalent to a
motion in some one-dimensional system with lagrangian function $L = \frac{1}{2}\dot{q}^2 - V(q)$.
Hint. Take length along the wire for $q$.


23 Small oscillations 


We show here that a lagrangian system undergoing small oscillations decomposes into a direct 
product of systems with one degree of freedom. 


A A problem about pairs of forms 


We will consider in more detail the problem of small oscillations. In other 
words, we consider a system whose kinetic and potential energies are 
quadratic forms 

(1)    $T = \tfrac{1}{2}(A\dot{q}, \dot{q}), \qquad U = \tfrac{1}{2}(Bq, q), \qquad q \in \mathbb{R}^n,\ \dot{q} \in \mathbb{R}^n.$

The kinetic energy is a positive-definite form.

In order to integrate Lagrange's equations, we will make a special choice
of coordinates.

As we know from linear algebra, a pair of quadratic forms $(Aq, q)$, $(Bq, q)$,
the first of which is positive-definite, can be reduced to principal axes by a
linear change of coordinates:⁴²

$$Q = Cq, \qquad Q = (Q_1, \ldots, Q_n).$$

In addition, the coordinates $Q$ can be chosen so that the form $(Aq, q)$ de-
composes into the sum of squares $(Q, Q)$. Let $Q$ be such coordinates; then,
since $\dot{Q} = C\dot{q}$, we have

(2)    $T = \dfrac{1}{2}\sum_{i=1}^{n}\dot{Q}_i^2, \qquad U = \dfrac{1}{2}\sum_{i=1}^{n}\lambda_iQ_i^2.$

The numbers $\lambda_i$ are called the eigenvalues of the form $B$ with respect to $A$.


PROBLEM. Show that the eigenvalues of $B$ with respect to $A$ satisfy the char-
acteristic equation

(3)    $\det|B - \lambda A| = 0,$

all the roots of which are, therefore, real (the matrices $A$ and $B$ are symmetric
and $A > 0$).

B Characteristic oscillations 


In the coordinates $Q$ the lagrangian system decomposes into $n$ independent
equations

(4)    $\ddot{Q}_i = -\lambda_iQ_i.$

⁴² If one wants to, one can introduce a euclidean structure by taking the first form as the scalar
product, and then reducing the second form to the principal axes by a transformation which is
orthogonal with respect to this euclidean structure.

Therefore we have proved:


Theorem. A system performing small oscillations is the direct product of $n$ one-
dimensional systems performing small oscillations.


For the one-dimensional systems, there are three possible cases:

Case 1: $\lambda = \omega^2 > 0$; the solution is $Q = C_1\cos\omega t + C_2\sin\omega t$ (oscillation)
Case 2: $\lambda = 0$; the solution is $Q = C_1 + C_2t$ (neutral equilibrium)
Case 3: $\lambda = -k^2 < 0$; the solution is $Q = C_1\cosh kt + C_2\sinh kt$ (instability)


Corollary. Suppose one of the eigenvalues of (3) is positive: $\lambda = \omega^2 > 0$. Then
system (1) can perform a small oscillation of the form

(5)    $q(t) = (C_1\cos\omega t + C_2\sin\omega t)\,\xi,$

where $\xi$ is an eigenvector corresponding to $\lambda$ (Figure 78):

$$B\xi = \lambda A\xi.$$


Figure 78 Characteristic oscillation


This oscillation is the product of the one-dimensional motion
$Q_i = C_1\cos\omega_it + C_2\sin\omega_it$ and the trivial motion $Q_j = 0$ ($j \ne i$).


Definition. The periodic motion (5) is called a characteristic oscillation of
system (1), and the number $\omega$ is called the characteristic frequency.


Remark. Characteristic oscillations are also called principal oscillations
or normal modes. A nonpositive $\lambda$ also has eigenvectors; we will also call the
corresponding motions "characteristic oscillations," although they are not
periodic; the corresponding "characteristic frequencies" are imaginary.


PROBLEM. Show that the number of independent real characteristic oscil-
lations is equal to the dimension of the largest positive-definite subspace for
the potential energy $\frac{1}{2}(Bq, q)$.

Now the result may be formulated as follows:


Theorem. The system (1) has n characteristic oscillations, the directions of 
which are pairwise orthogonal with respect to the scalar product given by 
the kinetic energy A. 


PROOF. The coordinate system $Q$ is orthogonal with respect to the scalar
product $(Aq, q)$ by (2). □


C Decomposition into characteristic oscillations


It follows from the above theorem that:


Corollary. Every small oscillation is a sum of characteristic oscillations.


A sum of characteristic oscillations is generally not periodic (remember
the Lissajous figures!).

To decompose a motion into a sum of characteristic oscillations, it is
sufficient to project the initial conditions $q$, $\dot{q}$ onto the characteristic direc-
tions $\xi_i$ and solve the corresponding one-dimensional problems (4).

Therefore, the Lagrange equations for system (1) can be solved in the
following way. We first look for characteristic oscillations of the form
$q = e^{i\omega t}\xi$. Substituting these into Lagrange's equations

$$\frac{d}{dt}A\dot{q} = -Bq,$$

we find

$$(B - \omega^2A)\xi = 0.$$

From the characteristic equation (3) we find $n$ eigenvalues $\lambda_k = \omega_k^2$. To these
there correspond $n$ pairwise orthogonal eigenvectors $\xi_k$. A general solution
in the case $\lambda \ne 0$ has the form

$$q(t) = \operatorname{Re}\sum_{k=1}^{n}C_ke^{i\omega_kt}\xi_k.$$

Remark. This result is also true when some of the $\lambda$ are multiple eigen-
values.
Thus, in a lagrangian system, as opposed to a general system of linear 
differential equations, resonance terms of the form t sin wt, etc. do not arise, 
even in the case of multiple eigenvalues. 


D Examples 


EXAMPLE 1. Consider the system of two identical mathematical pendulums of length $l_1 = l_2 = 1$
and mass $m_1 = m_2 = 1$ in a gravitational field with $g = 1$. Suppose that the pendulums are
connected by a weightless spring whose length is equal to the distance between the points of
suspension (Figure 79). Denote by $q_1$ and $q_2$ the angles of inclination of the pendulums. Then
for small oscillations, $T = \frac{1}{2}(\dot{q}_1^2 + \dot{q}_2^2)$ and $U = \frac{1}{2}(q_1^2 + q_2^2 + \alpha(q_1 - q_2)^2)$, where $\frac{1}{2}\alpha(q_1 - q_2)^2$
is the potential energy of the elasticity of the spring.


Figure 79 Identical connected pendulums


Set

$$Q_1 = \frac{q_1 + q_2}{\sqrt{2}} \quad\text{and}\quad Q_2 = \frac{q_1 - q_2}{\sqrt{2}}.$$

Then

$$q_1 = \frac{Q_1 + Q_2}{\sqrt{2}} \quad\text{and}\quad q_2 = \frac{Q_1 - Q_2}{\sqrt{2}},$$

and both forms are reduced to principal axes:

$$T = \tfrac{1}{2}(\dot{Q}_1^2 + \dot{Q}_2^2), \qquad U = \tfrac{1}{2}(\omega_1^2Q_1^2 + \omega_2^2Q_2^2),$$

where $\omega_1 = 1$ and $\omega_2 = \sqrt{1 + 2\alpha}$ (Figure 80). So the two characteristic oscillations are as
follows (Figure 81):

1. $Q_2 = 0$, i.e., $q_1 = q_2$: both pendulums move in phase with the original frequency 1, and the
spring has no effect;

2. $Q_1 = 0$, i.e., $q_1 = -q_2$: the pendulums move in opposite phase with increased frequency
$\omega_2 > 1$ due to the action of the spring.
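
The frequencies of Example 1 can be recovered directly from the characteristic equation det(B - λA) = 0. The sketch below solves this quadratic for a pair of symmetric 2x2 matrices and applies it to A = E, B = [[1 + α, -α], [-α, 1 + α]], which are the matrices of the forms T and U above. The helper name and the value α = 0.3 are illustrative assumptions.

```python
import math

def pencil_eigenvalues_2x2(A, B):
    """Roots of det(B - lam*A) = 0 for symmetric 2x2 matrices, A positive definite.

    Returns the two roots in increasing order.
    """
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    # det(B - lam*A) = c2*lam^2 + c1*lam + c0
    c2 = a11 * a22 - a12 * a12
    c1 = -(a11 * b22 + a22 * b11 - 2 * a12 * b12)
    c0 = b11 * b22 - b12 * b12
    disc = math.sqrt(c1 * c1 - 4 * c2 * c0)  # real, since the roots are real
    return sorted([(-c1 - disc) / (2 * c2), (-c1 + disc) / (2 * c2)])

if __name__ == "__main__":
    alpha = 0.3  # hypothetical spring stiffness
    A = [[1.0, 0.0], [0.0, 1.0]]
    B = [[1.0 + alpha, -alpha], [-alpha, 1.0 + alpha]]
    lam1, lam2 = pencil_eigenvalues_2x2(A, B)
    print(math.sqrt(lam1), math.sqrt(lam2), math.sqrt(1 + 2 * alpha))
```

The first printed frequency is 1 up to rounding, and the second agrees with the square root of 1 + 2α.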
Figure 80 Configuration space of the connected pendulums


Figure 81 Characteristic oscillations of the connected pendulums


Now let the spring be very weak: $\alpha \ll 1$. Then an interesting effect called exchange of energy
occurs.


EXAMPLE 2. Suppose that the pendulums are at rest at the initial moment, and one of them is
given velocity $\dot{q}_1 = v$. We will show that after some time $T$ the first pendulum will be almost
stationary, and all the energy will have gone to the second.

It follows from the initial conditions that $Q_1(0) = Q_2(0) = 0$. Therefore, $Q_1 = c_1\sin t$, and
$Q_2 = c_2\sin\omega t$ with $\omega = \sqrt{1 + 2\alpha} \approx 1 + \alpha$ ($\alpha \ll 1$). But $\dot{Q}_1(0) = \dot{Q}_2(0) = v/\sqrt{2}$. Therefore,
$c_1 = v/\sqrt{2}$ and $c_2 = v/(\omega\sqrt{2})$, and our solution has the form

$$q_1 = \frac{v}{2}\left(\sin t + \frac{1}{\omega}\sin\omega t\right), \qquad q_2 = \frac{v}{2}\left(\sin t - \frac{1}{\omega}\sin\omega t\right)$$

or, disregarding the term $\frac{v}{2}(1 - (1/\omega))\sin\omega t$, which is small since $\alpha$ is,

$$q_1 \approx \frac{v}{2}(\sin t + \sin\omega t) = v\cos\varepsilon t\,\sin\omega't,$$

$$q_2 \approx \frac{v}{2}(\sin t - \sin\omega t) = -v\cos\omega't\,\sin\varepsilon t,$$

$$\varepsilon = \frac{\omega - 1}{2} \approx \frac{\alpha}{2}, \qquad \omega' = \frac{\omega + 1}{2} \approx 1.$$

The quantity $\varepsilon \approx \alpha/2$ is small, since $\alpha$ is; therefore $q_1$ undergoes an oscillation of frequency
$\omega' \approx 1$ with slowly changing amplitude $v\cos\varepsilon t$ (Figure 82).

After time $T = \pi/2\varepsilon \approx \pi/\alpha$, essentially only the second pendulum will be oscillating; after
$2T$, again only the first, etc. ("beats") (Figure 83).
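
The exchange of energy can be seen by evaluating the exact solution of Example 2 near the times T and 2T. The sketch below samples q_1 and q_2 over one fast period centred at a chosen time and reports the largest absolute values found. The function names, the window width, and the value α = 0.02 are assumptions made for illustration.

```python
import math

def beats(alpha, v, t):
    """Exact small-oscillation solution of Example 2: returns (q1, q2) at time t."""
    omega = math.sqrt(1 + 2 * alpha)
    q1 = 0.5 * v * (math.sin(t) + math.sin(omega * t) / omega)
    q2 = 0.5 * v * (math.sin(t) - math.sin(omega * t) / omega)
    return q1, q2

def local_amplitudes(alpha, v, t_center, half_width=math.pi, samples=2001):
    """Largest |q1| and |q2| on [t_center - half_width, t_center + half_width]."""
    ts = [t_center - half_width + 2 * half_width * i / (samples - 1) for i in range(samples)]
    values = [beats(alpha, v, t) for t in ts]
    return max(abs(q1) for q1, _ in values), max(abs(q2) for _, q2 in values)

if __name__ == "__main__":
    alpha = 0.02                          # hypothetical weak spring
    omega = math.sqrt(1 + 2 * alpha)
    T = math.pi / (omega - 1)             # T = pi / (2*eps), eps = (omega - 1)/2
    print(local_amplitudes(alpha, 1.0, T))      # first pendulum nearly at rest
    print(local_amplitudes(alpha, 1.0, 2 * T))  # second pendulum nearly at rest
```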


Figure 82 Beats: trajectories in the configuration space


Figure 83 Beats


Figure 84 Connected pendulums


Figure 85 Potential energy of strongly connected pendulums


EXAMPLE 3. We investigate the characteristic oscillations of two different pendulums ($m_1 \ne m_2$,
$l_1 \ne l_2$, $g = 1$), connected by a spring with energy $\frac{1}{2}\alpha(q_1 - q_2)^2$ (Figure 84). How do the charac-
teristic frequencies behave as $\alpha \to 0$ or as $\alpha \to \infty$?

We have

$$T = \tfrac{1}{2}(m_1l_1^2\dot{q}_1^2 + m_2l_2^2\dot{q}_2^2),$$

$$U = m_1l_1\frac{q_1^2}{2} + m_2l_2\frac{q_2^2}{2} + \frac{\alpha}{2}(q_1 - q_2)^2.$$

Therefore (Figure 85),

$$A = \begin{pmatrix} m_1l_1^2 & 0 \\ 0 & m_2l_2^2 \end{pmatrix}, \qquad B = \begin{pmatrix} m_1l_1 + \alpha & -\alpha \\ -\alpha & m_2l_2 + \alpha \end{pmatrix},$$

and the characteristic equation has the form

$$\det(B - \lambda A) = \det\begin{pmatrix} m_1l_1 + \alpha - \lambda m_1l_1^2 & -\alpha \\ -\alpha & m_2l_2 + \alpha - \lambda m_2l_2^2 \end{pmatrix} = 0$$

or

$$a\lambda^2 - (b_0 + b_1\alpha)\lambda + (c_0 + c_1\alpha) = 0,$$

where

$$a = m_1m_2l_1^2l_2^2,$$
$$b_0 = m_1l_1m_2l_2(l_1 + l_2), \qquad b_1 = m_1l_1^2 + m_2l_2^2,$$
$$c_0 = m_1m_2l_1l_2, \qquad c_1 = m_1l_1 + m_2l_2.$$

This is the equation of a hyperbola in the $(\alpha, \lambda)$-plane (Figure 86). As $\alpha \to 0$ (weak spring) the
frequencies approach the frequencies of free pendulums ($\omega_{1,2}^2 = l_{1,2}^{-1}$); as $\alpha \to \infty$, one of the
frequencies tends to $\infty$, while the other approaches the characteristic frequency $\omega_\infty$ of a pendu-
lum with two masses on one rod (Figure 87):

$$\omega_\infty^2 = \frac{m_1l_1 + m_2l_2}{m_1l_1^2 + m_2l_2^2}.$$


Figure 86 Dependence of characteristic frequencies on the stiffness of the spring


Figure 87 Limiting case of pendulums connected by an infinitely stiff spring

PROBLEM. Investigate the characteristic oscillations of a planar double pendulum (Figure 88).


PROBLEM. Find the shape of the trajectories of the small oscillations of a point mass on the plane,
sitting inside an equilateral triangle and connected by identical springs to the vertices (Figure 89).


Figure 88 Double pendulum


Figure 89 System with an infinite set of characteristic oscillations


Solution. Under rotation by 120° the system is mapped onto itself. Consequently, all direc-
tions are characteristic, and both characteristic frequencies are the same: $U = \frac{1}{2}\omega^2(x^2 + y^2)$.
Therefore, the trajectories are ellipses (cf. Figure 20).


24 Behavior of characteristic frequencies 


We prove here the Rayleigh-Courant-Fisher theorem on the behavior of characteristic fre- 
quencies of a system under increases in rigidity and under imposed constraints. 


A Behavior of characteristic frequencies under a change in rigidity


Consider a system performing small oscillations, with kinetic and potential
energies

$$T = \tfrac{1}{2}(A\dot{q}, \dot{q}) > 0 \quad\text{and}\quad U = \tfrac{1}{2}(Bq, q) > 0 \quad\text{for all } q, \dot{q} \ne 0.$$


Definition. A system with the same kinetic energy, and a new potential energy
$U'$, is called more rigid if $U' = \frac{1}{2}(B'q, q) \ge \frac{1}{2}(Bq, q) = U$ for all $q$.


We wish to understand how the characteristic frequencies change under
an increase in the rigidity of a system.


PROBLEM. Discuss the one-dimensional case.


Theorem 1. Under an increase in rigidity, all the characteristic frequencies
are increased, i.e., if $\omega_1 \le \omega_2 \le \cdots \le \omega_n$ are the characteristic frequencies
of the less rigid system, and $\omega_1' \le \omega_2' \le \cdots \le \omega_n'$ are the characteristic
frequencies of the more rigid system, then $\omega_1 \le \omega_1'$; $\omega_2 \le \omega_2'$; ...; $\omega_n \le \omega_n'$.


This theorem has a simple geometric meaning. Without loss of generality
we may assume that $A = E$, i.e., that we are considering the euclidean struc-
ture given by the kinetic energy $T = \frac{1}{2}(\dot{q}, \dot{q})$. To each system we associate the
ellipsoids $E$: $(Bq, q) = 1$ and $E'$: $(B'q, q) = 1$.

It is clear that


Lemma 1. If the system $U'$ is more rigid than $U$, then the corresponding
ellipsoid $E'$ lies inside $E$.


It is also clear that


Lemma 2. The major semi-axes of the ellipsoid are the inverses of the char-
acteristic frequencies $\omega_i$: $a_i = 1/\omega_i$.


Therefore, Theorem 1 is equivalent to the following geometric proposition
(Figure 90).


Figure 90 The semi-axes of the inside ellipse are smaller.


Theorem 2. If the ellipsoid $E$ with semi-axes $a_1 \ge a_2 \ge \cdots \ge a_n$ contains the
ellipsoid $E'$ with semi-axes $a_1' \ge a_2' \ge \cdots \ge a_n'$, both ellipsoids having the
same center, then the semi-axes of the inside ellipsoid are smaller:

$$a_1 \ge a_1',\ a_2 \ge a_2',\ \ldots,\ a_n \ge a_n'.$$


EXAMPLE. Under an increase in the rigidity $\alpha$ of the spring connecting the pendulums of Example
3, Section 23, the potential energy grows, and by Theorem 1, the characteristic frequencies grow:
$d\omega_i/d\alpha > 0$.

Now consider the case when the rigidity of the spring approaches infinity, $\alpha \to \infty$. Then in
the limit the pendulums are rigidly connected and we get a system with one degree of freedom;
the limiting characteristic frequency $\omega_\infty$ satisfies $\omega_1 < \omega_\infty < \omega_2$.
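
Theorem 1 lends itself to a quick numerical experiment. The sketch below is an illustration, not a proof: it builds a random positive-definite pair A, B, stiffens the system by adding a positive-semidefinite form to B, and compares the sorted characteristic frequencies. It assumes NumPy is available. The function name and the random test matrices are hypothetical.

```python
import numpy as np

def characteristic_frequencies(A, B):
    """Sorted frequencies omega_i = sqrt(lambda_i), where det(B - lambda*A) = 0.

    A must be symmetric positive definite; B symmetric positive definite.
    """
    L = np.linalg.cholesky(A)            # A = L L^T
    Linv = np.linalg.inv(L)
    M = Linv @ B @ Linv.T                # symmetric, same eigenvalues as the pair (B, A)
    lam = np.linalg.eigvalsh(M)          # returned in ascending order
    return np.sqrt(lam)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5
    X = rng.standard_normal((n, n))
    A = X @ X.T + n * np.eye(n)          # kinetic energy form
    Y = rng.standard_normal((n, n))
    B = Y @ Y.T + np.eye(n)              # potential energy form
    Z = rng.standard_normal((n, 2))
    B_rigid = B + Z @ Z.T                # more rigid: U' >= U for all q
    print(characteristic_frequencies(A, B))
    print(characteristic_frequencies(A, B_rigid))
```

Each entry of the second printed array is at least as large as the corresponding entry of the first.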


B Behavior of characteristic frequencies under the 
imposition of a constraint 


We return to a general system with $n$ degrees of freedom, and let $T = \frac{1}{2}(\dot{q}, \dot{q})$
and $U = \frac{1}{2}(Bq, q)$ ($q \in \mathbb{R}^n$) be the kinetic and potential energies of a system
performing small oscillations.


Figure 91 Linear constraint


Let $\mathbb{R}^{n-1} \subset \mathbb{R}^n$ be an $(n-1)$-dimensional subspace in $\mathbb{R}^n$ (Figure 91).
Consider the system with $n - 1$ degrees of freedom ($q \in \mathbb{R}^{n-1}$) whose kinetic
and potential energies are the restrictions of $T$ and $U$ to $\mathbb{R}^{n-1}$. We say that
this system is obtained from the original by imposition of a linear constraint.

Let $\omega_1 \le \omega_2 \le \cdots \le \omega_n$ be the $n$ characteristic frequencies of the original
system, and

$$\omega_1' \le \omega_2' \le \cdots \le \omega_{n-1}'$$

the $(n - 1)$ characteristic frequencies of the system with a constraint.


Figure 92 Separation of frequencies


Theorem 3. The characteristic frequencies of the system with a constraint
separate the characteristic frequencies of the original system (Figure 92):

$$\omega_1 \le \omega_1' \le \omega_2 \le \omega_2' \le \cdots \le \omega_{n-1} \le \omega_{n-1}' \le \omega_n.$$


By Lemma 2 this theorem is equivalent to the following geometric propo-
sition.


Theorem 4. Consider the cross-section of the $n$-dimensional ellipsoid
$E = \{q : (Bq, q) = 1\}$ with semi-axes $a_1 \ge a_2 \ge \cdots \ge a_n$ by a hyperplane $\mathbb{R}^{n-1}$
through its center. Then the semi-axes of this $(n-1)$-dimensional ellip-
soid (the cross-section $E'$) separate the semi-axes of the ellipsoid $E$
(Figure 93):

$$a_1 \ge a_1' \ge a_2 \ge a_2' \ge \cdots \ge a_{n-1} \ge a_{n-1}' \ge a_n.$$


Figure 93 The semi-axes of the intersection separate the semi-axes of the ellipsoid
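
The separation property of Theorem 3 can also be observed numerically. The sketch below takes T = (q', q')/2, a random positive-definite B, and the hyperplane orthogonal to a random vector, then restricts the form to that hyperplane and compares frequencies. It assumes NumPy is available. The function names and the random data are illustrative choices.

```python
import numpy as np

def frequencies(B):
    """Characteristic frequencies for T = (q', q')/2, U = (Bq, q)/2, in ascending order."""
    return np.sqrt(np.linalg.eigvalsh(B))

def constrained_frequencies(B, normal):
    """Frequencies after imposing the linear constraint (normal, q) = 0."""
    normal = np.asarray(normal, dtype=float)
    # Rows 1..n-1 of Vt form an orthonormal basis of the hyperplane orthogonal to `normal`.
    _, _, Vt = np.linalg.svd(normal.reshape(1, -1))
    P = Vt[1:].T
    return np.sqrt(np.linalg.eigvalsh(P.T @ B @ P))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 6
    Y = rng.standard_normal((n, n))
    B = Y @ Y.T + np.eye(n)
    print(frequencies(B))
    print(constrained_frequencies(B, rng.standard_normal(n)))
```

Because the basis of the hyperplane is orthonormal, the restricted kinetic energy is again the standard euclidean form, so the restricted frequencies are simply the square roots of the eigenvalues of the restricted matrix.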


C Extremal properties of eigenvalues 


Theorem 5. The smallest semi-axis of any cross-section of the ellipsoid $E$ with
semi-axes $a_1 \ge a_2 \ge \cdots \ge a_n$ by a subspace $\mathbb{R}^k$ is less than or equal to $a_k$:

$$a_k = \max_{\{\mathbb{R}^k\}}\ \min_{x \in \mathbb{R}^k \cap E} \|x\|$$

(the upper bound is attained on the subspace spanned by the semi-axes
$a_1 \ge a_2 \ge \cdots \ge a_k$).


PROOF.⁴³ Consider the subspace $\mathbb{R}^{n-k+1}$ spanned by the axes $a_k \ge a_{k+1} \ge \cdots \ge a_n$.
Its dimension is $n - k + 1$. Therefore, it intersects $\mathbb{R}^k$. Let $x$ be a point
of the intersection lying on the ellipsoid. Then $\|x\| \le a_k$, since $x \in \mathbb{R}^{n-k+1}$.
Since $l \le \|x\|$, where $l$ is the length of the smallest semi-axis of the ellipsoid
$E \cap \mathbb{R}^k$, $l$ must be no larger than $a_k$. □


⁴³ It is useful to think of the case $n = 3$, $k = 2$.


PROOF OF THEOREM 2. The smallest semi-axis of every $k$-dimensional
section of the inner ellipsoid $\mathbb{R}^k \cap E'$ is less than or equal to the smallest
semi-axis of $\mathbb{R}^k \cap E$. By Theorem 5,

$$a_k' = \max_{\{\mathbb{R}^k\}}\ \min_{x \in \mathbb{R}^k \cap E'} \|x\| \le \max_{\{\mathbb{R}^k\}}\ \min_{x \in \mathbb{R}^k \cap E} \|x\| = a_k. \qquad \square$$


PROOF OF THEOREM 4. The inequality $a_k' \le a_k$ follows from Theorem 5,
since in the calculation of $a_k$ the maximum is taken over a larger set. To prove
the inequality $a_k' \ge a_{k+1}$, we intersect $\mathbb{R}^{n-1}$ with any $(k+1)$-dimensional
subspace $\mathbb{R}^{k+1}$. The intersection has dimension greater than or equal to $k$.
The smallest semi-axis of the ellipsoid $E' \cap \mathbb{R}^{k+1}$ is greater than or equal to
the smallest semi-axis of $E \cap \mathbb{R}^{k+1}$. By Theorem 5,

$$a_k' = \max_{\{\mathbb{R}^k \subset \mathbb{R}^{n-1}\}}\ \min_{x \in \mathbb{R}^k \cap E'} \|x\| \ \ge\ \max_{\{\mathbb{R}^{k+1} \subset \mathbb{R}^n\}}\ \min_{x \in \mathbb{R}^{k+1} \cap E'} \|x\| \ \ge\ \max_{\{\mathbb{R}^{k+1} \subset \mathbb{R}^n\}}\ \min_{x \in \mathbb{R}^{k+1} \cap E} \|x\| = a_{k+1}. \qquad \square$$

Theorems 1 and 3 follow directly from those just proven.


PROBLEM. Show that if we increase the kinetic energy of a system without 
decreasing the potential energy (for example, we increase the mass on a given 
spring), then every characteristic frequency decreases. 


PROBLEM. Show that under the orthogonal projection of an ellipsoid lying in one subspace of 
euclidean space onto another subspace, all the semi-axes are decreased. 


PROBLEM. Suppose that a quadratic form $A(\varepsilon)$ on euclidean space $\mathbb{R}^n$ is a continuously differen-
tiable function of the parameter $\varepsilon$. Show that every characteristic frequency depends differen-
tiably on $\varepsilon$, and find the derivatives.


ANSWER. Let $\lambda_1, \ldots, \lambda_k$ be the eigenvalues of $A(0)$. To every eigenvalue $\lambda_i$ of multiplicity $\nu_i$ there
corresponds a subspace $\mathbb{R}^{\nu_i}$. The derivatives of the eigenvalues of $A(\varepsilon)$ at 0 are equal to the
eigenvalues of the restricted form $B = (dA/d\varepsilon)|_{\varepsilon=0}$ on $\mathbb{R}^{\nu_i}$.

In particular, if all the eigenvalues of $A(0)$ are simple, then their derivatives are equal to the
diagonal elements of the matrix $B$ in the characteristic basis for $A(0)$.

It follows from this problem that when a form is increased, its eigenvalues grow. In this way
we obtain new proofs of Theorems 1 and 2.


PROBLEM. How does the pitch of a bell change when a crack appears in the bell?


25 Parametric resonance 


If the parameters of a system vary periodically with time, then an equilibrium position can be 
unstable, even if it is stable for each fixed value of the parameter. This instability is what makes it 
possible to swing on a swing.


A Dynamical systems whose parameters vary periodically with time


EXAMPLE 1. A swing: the length of the equivalent mathematical pendulum
$l(t)$ varies periodically with time: $l(t + T) = l(t)$ (Figure 94).


Figure 94 Swing


EXAMPLE 2. A pendulum in a periodically varying gravitational field (for
example, the moon) is described by Hill's equation:

(1)    $\ddot{q} = -\omega^2(t)\,q, \qquad \omega(t + T) = \omega(t).$


EXAMPLE 3. A pendulum suspended from a point which periodically oscillates
vertically is also described by an equation of the form (1).


For systems with periodically varying parameters the right-hand sides of
the equations of motion are periodic functions of $t$. The equations of motion
can be written in the form of a system of first-order ordinary differential
equations

(2)    $\dot{x} = f(x, t), \qquad f(x, t + T) = f(x, t), \qquad x \in \mathbb{R}^n$

with periodic right-hand sides. For example, Equation (1) can be written as
the system

(3)    $\dot{x}_1 = x_2, \qquad \dot{x}_2 = -\omega^2x_1, \qquad \omega(t + T) = \omega(t).$


B The mapping at a period 


Recall the general properties of the system (2). We denote by $g^t : \mathbb{R}^n \to \mathbb{R}^n$ the
mapping taking $x \in \mathbb{R}^n$ to the value at time $t$, $g^tx = \varphi(t)$, of the solution $\varphi$ of
system (2) with initial conditions $\varphi(0) = x$ (Figure 95).

The mappings $g^t$ do not form a group: in general,

$$g^{t+s} \ne g^tg^s \ne g^sg^t.$$


PROBLEM. Show that $\{g^t\}$ is a group if and only if the right-hand sides $f$ do not
depend on $t$.


PROBLEM. Show that, if $T$ is the period of $f$, then $g^{T+s} = g^s \cdot g^T$ and, in
particular, $g^{nT} = (g^T)^n$, so that the mappings $g^{nT}$ ($n$ an integer) form a group.


Figure 95 Mapping at a period


The mapping $g^T : \mathbb{R}^n \to \mathbb{R}^n$ plays an important role in what is to come; we
will call it the mapping at a period and will denote it by

$$A : \mathbb{R}^n \to \mathbb{R}^n, \qquad Ax(0) = x(T).$$


EXAMPLE. For the systems

$$\begin{cases}\dot{x}_1 = x_2\\ \dot{x}_2 = -x_1\end{cases} \qquad\text{and}\qquad \begin{cases}\dot{x}_1 = x_1\\ \dot{x}_2 = -x_2\end{cases}$$

which can be considered periodic with any period $T$, the mapping $A$ is a rotation or a hyper-
bolic rotation (Figure 96).


Figure 96 Rotation and hyperbolic rotation


Theorem. 


1. The point $x_0$ is a fixed point of the mapping A ($A x_0 = x_0$) if and only if the
solution with initial conditions $x(0) = x_0$ is periodic with period T.

2. The periodic solution x(t) is Liapunov stable (asymptotically stable) if and
only if the fixed point $x_0$ of the mapping A is Liapunov stable (asymptotically
stable).⁴⁴

3. If the system (2) is linear, i.e., f(x, t) = f(t)x is a linear function of x,
then A is linear.

4. If the system (2) is hamiltonian, then A preserves volume: $\det A_* = 1$.


44 A fixed point $x_0$ of the mapping A is Liapunov stable (respectively, asymptotically stable) if
$\forall \varepsilon > 0$, $\exists \delta > 0$ such that if $|x - x_0| < \delta$, then $|A^n x - A^n x_0| < \varepsilon$ for all $0 < n < \infty$
(respectively, $A^n x - A^n x_0 \to 0$ as $n \to \infty$).
PROOF. Assertions (1) and (2) follow from the relationship $g^{T+s} = g^s A$.
Assertion (3) follows from the fact that a sum of solutions of a linear system
is again a solution. Assertion (4) follows from Liouville’s theorem. □


We apply the theorem above to the mapping A of the phase plane $\{(x_1, x_2)\}$
onto itself, corresponding to the equation (1) and the system (3). Since (3) is
linear and hamiltonian ($H = \tfrac12 \omega^2 x_1^2 + \tfrac12 x_2^2$), we get:


Corollary. The mapping A is linear, and preserves area (det A = 1). The trivial 
solution of Equation (1) is stable if and only if the mapping A is stable. 


PROBLEM. Show that a rotation of the plane is a stable mapping, and a 
hyperbolic rotation is unstable. 


C Linear mappings of the plane to itself which 
preserve area 


Theorem. Let A be the matrix of a linear mapping of the plane to itself which
preserves area (det A = 1). Then the mapping A is stable if |tr A| < 2, and
unstable if |tr A| > 2 ($\operatorname{tr} A = a_{11} + a_{22}$).


PROOF. Let $\lambda_1$ and $\lambda_2$ be the eigenvalues of A. They satisfy the characteristic
equation $\lambda^2 - (\operatorname{tr} A)\lambda + 1 = 0$ with real coefficients $\lambda_1 + \lambda_2 = \operatorname{tr} A$ and
$\lambda_1 \cdot \lambda_2 = \det A = 1$. The roots $\lambda_1$ and $\lambda_2$ of this real quadratic equation are
real for |tr A| > 2 and complex conjugate for |tr A| < 2.

In the first case one of the eigenvalues has absolute value greater than 1, 
and one has absolute value less than 1; the mapping A is a hyperbolic 
rotation and is unstable (Figure 97). 


Figure 97 Eigenvalues of the mapping A 


In the second case the eigenvalues lie on the unit circle (Figure 97):
$1 = \lambda_1 \cdot \lambda_2 = \lambda_1 \cdot \bar\lambda_1 = |\lambda_1|^2.$


The mapping A is equivalent to a rotation through angle $\alpha$ (where $\lambda_{1,2} = e^{\pm i\alpha}$),
i.e., it may be reduced to a rotation by means of an appropriate choice of
coordinates on the plane. Therefore, it is stable. □


In this way, every question about the stability of the trivial solution of an 
equation of the form (1) is reduced to computation of the trace of the matrix 
A. Unfortunately, the calculation of this trace can be done explicitly only in
special cases. It is always possible to find the trace approximately by
numerically integrating the equation on the interval $0 \le t \le T$. In the important
case when $\omega(t)$ is close to a constant, some simple general arguments can help.
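
The numerical route mentioned above can be sketched as follows. This is only an illustration under our own assumptions: the function name `monodromy_trace`, the classical Runge-Kutta scheme and the step count are our choices, not part of the text. We integrate system (3) over one period from the two basis initial conditions (1, 0) and (0, 1); the columns of A are the resulting end states, and the trace is the sum of the diagonal entries.

```python
import math

def monodromy_trace(omega_sq, period, steps=4000):
    """Approximate tr A for x'' = -omega_sq(t) * x over one period (classical RK4).

    omega_sq is a callable t -> omega(t)**2 (hypothetical interface).
    """
    def rhs(t, x, v):
        # First-order form (3): x1' = x2, x2' = -omega^2(t) x1.
        return v, -omega_sq(t) * x

    def flow(x, v):
        h = period / steps
        t = 0.0
        for _ in range(steps):
            k1x, k1v = rhs(t, x, v)
            k2x, k2v = rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
            k3x, k3v = rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
            k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
            x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
            v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
            t += h
        return x, v

    a11, _ = flow(1.0, 0.0)   # first column of A is the image of (1, 0)
    _, a22 = flow(0.0, 1.0)   # second column of A is the image of (0, 1)
    return a11 + a22
```

For a constant frequency the result can be compared with the exact value 2 cos(ωT); for a slowly modulated frequency the sign of |tr A| − 2 then decides stability, as in the theorem of subsection C.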


D Strong stability 


Definition. The trivial solution of a hamiltonian linear system is strongly 
stable if it is stable, and if the trivial solution of every sufficiently close 
linear hamiltonian system is also stable.⁴⁵


The two theorems above imply: 


Corollary. If |tr A| < 2, then the trivial solution is strongly stable. 


PROOF. If |tr A| < 2, then a mapping A’ corresponding to a sufficiently close
system will also have |tr A’| < 2. □


Let us apply this to a system with almost constant (only slightly varying)
coefficients. Consider, for example, the equation
(4)    $\ddot x = -\omega^2 (1 + \varepsilon a(t))\,x, \qquad \varepsilon \ll 1,$


where $a(t + 2\pi) = a(t)$, e.g., $a(t) = \cos t$ (Figure 98) (a pendulum whose
frequency oscillates near $\omega$ with small amplitude and period $2\pi$).⁴⁶


Figure 98 Instantaneous frequency as a function of time 


We will represent each system of the form (4) by a point in the plane of
parameters $\varepsilon$, $\omega > 0$. Clearly, the stable systems with |tr A| < 2 form an
open set in the $(\omega, \varepsilon)$-plane; so do the unstable systems with |tr A| > 2
(Figure 99).

The boundary of stability is given by the equation |tr A| = 2. 


Theorem. All points on the $\omega$-axis except the integers and half-integers
$\omega = k/2$, k = 0, 1, 2, ... correspond to strongly stable systems (4).


45 The distance between two linear systems with periodic coefficients, $\dot x = B_1(t)x$, $\dot x = B_2(t)x$,
is defined as the maximum over t of the distance between the operators $B_1(t)$ and $B_2(t)$.


46 In the case a(t) = cos t, Equation (4) is called Mathieu's equation.


Figure 99 Zones of parametric resonance 


Thus, the set of unstable systems can approach the $\omega$-axis only at the
points $\omega = k/2$. In other words, swinging a swing by small periodic changes
of the length is possible only in the case when one period of the change in 
length is close to a whole number of half-periods of characteristic oscillations 
—a result well known experimentally. 

The proof of the theorem above is based on the fact that for $\varepsilon = 0$, Equation
(4) has constant coefficients and is clearly solvable.


PROBLEM. Calculate the matrix of the transformation A after period $T = 2\pi$
in the basis $x, \dot x$ for system (4) with $\varepsilon = 0$.


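Solution. The general solution is:
$x = c_1 \cos \omega t + c_2 \sin \omega t.$
The solution with initial conditions $x = 1$, $\dot x = 0$ is:
$x = \cos \omega t, \qquad \dot x = -\omega \sin \omega t.$


The solution with initial conditions $x = 0$, $\dot x = 1$ is:
$x = \frac{1}{\omega} \sin \omega t, \qquad \dot x = \cos \omega t.$


ANSWER.
$A = \begin{pmatrix} \cos 2\pi\omega & \frac{1}{\omega} \sin 2\pi\omega \\ -\omega \sin 2\pi\omega & \cos 2\pi\omega \end{pmatrix}.$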


Therefore, $|\operatorname{tr} A| = |2 \cos 2\pi\omega| < 2$ if $\omega \ne k/2$, k = 0, 1, ..., and the
theorem follows from the preceding corollary.

A more careful analysis⁴⁷ shows that in general (and for a(t) = cos t)
the region of instability (shaded in Figure 99) in fact approaches the $\omega$-axis
near the points $\omega = k/2$, k = 1, 2, ....


47 Cf., for example, the problem analyzed below.
Thus, for $\omega \approx k/2$, k = 1, 2, ..., the lowest equilibrium position of the
idealized swing (4) is unstable and it swings under an arbitrarily small
periodic change of length. This phenomenon is called parametric resonance.
A characteristic property of parametric resonance is that it is strongest when
the frequency of the variation of the parameter $\nu$ (in Equation (4), $\nu = 1$)
is twice the characteristic frequency $\omega$.

Remark. Theoretically, parametric resonance can be observed for the
infinite collection of cases $\omega/\nu \approx k/2$, k = 1, 2, .... In practice, it is usually
observed only when k is small (k = 1, 2, and more rarely, 3). The reason is
that:


1. For large k the region of instability approaches the $\omega$-axis in a very narrow
“tongue” and the resonance frequencies $\omega$ must satisfy very rigid bounds
($\sim \theta^k$, where $\theta \in (0, 1)$ depends on the width of the analyticity band for the
function a(t) in (4)).

2. The instability itself is weak for large k, since |tr A| − 2 is small and the
eigenvalues are close to 1 for large k.

3. If there is an arbitrarily small amount of friction, then there is a minimal
value $\varepsilon_k$ of the amplitude in order for parametric resonance to begin (for $\varepsilon$
less than this the oscillation dies out). As k grows, $\varepsilon_k$ grows quickly (Figure
100).



Figure 100 Influence of friction on parametric resonance 


We also notice that for Equation (4) the size of x grows without bound in 
the unstable case. In real systems, oscillations attain only finite amplitudes, 
since for large x the linear equation (4) itself loses influence, and we must 
consider the nonlinear effects. 


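PROBLEM. Find the shape of the region of stability in the $\varepsilon, \omega$-plane for the system described by
the equations


$\ddot x = -f^2(t)\,x, \qquad f(t) = \begin{cases} \omega + \varepsilon, & 0 < t < \pi, \\ \omega - \varepsilon, & \pi < t < 2\pi, \end{cases} \qquad \varepsilon \ll 1,$
$f(t + 2\pi) = f(t).$
Solution. It follows from the solution of the preceding problem that $A = A_2 A_1$, where


$A_k = \begin{pmatrix} c_k & \frac{1}{\omega_k} s_k \\ -\omega_k s_k & c_k \end{pmatrix}, \qquad c_k = \cos \pi\omega_k, \quad s_k = \sin \pi\omega_k, \quad \omega_{1,2} = \omega \pm \varepsilon.$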
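Therefore, the boundary of the zone of stability has the equation


(5)    $|\operatorname{tr} A| = \left| 2 c_1 c_2 - \left( \frac{\omega_1}{\omega_2} + \frac{\omega_2}{\omega_1} \right) s_1 s_2 \right| = 2.$


Since $\varepsilon \ll 1$, we have $\omega_1/\omega_2 = (\omega + \varepsilon)/(\omega - \varepsilon) \approx 1$. We introduce the notation
$\frac{\omega_1}{\omega_2} + \frac{\omega_2}{\omega_1} = 2(1 + \Delta).$


Then, as is easily computed, $\Delta = (2\varepsilon^2/\omega^2) + O(\varepsilon^4) \ll 1$. Using the relations $2 c_1 c_2 =
\cos 2\pi\varepsilon + \cos 2\pi\omega$ and $2 s_1 s_2 = \cos 2\pi\varepsilon - \cos 2\pi\omega$, we rewrite Equation (5) in the form


$-\Delta \cos 2\pi\varepsilon + (2 + \Delta) \cos 2\pi\omega = \pm 2$


or
(6a)    $\cos 2\pi\omega = \frac{2 + \Delta \cos 2\pi\varepsilon}{2 + \Delta},$
(6b)    $\cos 2\pi\omega = \frac{-2 + \Delta \cos 2\pi\varepsilon}{2 + \Delta}.$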


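In the first case $\cos 2\pi\omega \approx 1$. Therefore, we set
$\omega = k + a, \quad |a| \ll 1; \qquad \cos 2\pi\omega = \cos 2\pi a = 1 - 2\pi^2 a^2 + O(a^4).$

We rewrite Equation (6a) in the form

$\cos 2\pi\omega = 1 - \frac{\Delta}{2 + \Delta}(1 - \cos 2\pi\varepsilon)$
or $2\pi^2 a^2 + O(a^4) = \Delta \pi^2 \varepsilon^2 + O(\varepsilon^6)$.
Substituting in the value $\Delta = (2\varepsilon^2/\omega^2) + O(\varepsilon^4)$, we find


$a = \pm \frac{\varepsilon^2}{\omega} + o(\varepsilon^2), \quad \text{i.e.,} \quad \omega = k \pm \frac{\varepsilon^2}{k} + o(\varepsilon^2).$
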


Equation (6b) is solved analogously; for the result we get
$\omega = k + \tfrac12 \pm \frac{\varepsilon}{\pi (k + \tfrac12)} + o(\varepsilon).$


Therefore the answer has the form depicted in Figure 101. 


Figure 101 Zones of parametric resonance for $f = \omega \pm \varepsilon$.
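
The boundaries found in this problem can be checked directly from Equation (5). The sketch below is ours (the name `trace_piecewise` is hypothetical); it evaluates tr A = 2c₁c₂ − (ω₁/ω₂ + ω₂/ω₁)s₁s₂ for given ω and ε, so that points with |tr A| > 2 lie in a zone of parametric resonance.

```python
import math

def trace_piecewise(omega, eps):
    """tr A for x'' = -f(t)^2 x, f = omega + eps on (0, pi), omega - eps on (pi, 2*pi)."""
    w1, w2 = omega + eps, omega - eps
    c1, s1 = math.cos(math.pi * w1), math.sin(math.pi * w1)
    c2, s2 = math.cos(math.pi * w2), math.sin(math.pi * w2)
    # Trace of the product A2 * A1 of the two half-period matrices, Equation (5).
    return 2 * c1 * c2 - (w1 / w2 + w2 / w1) * s1 * s2
```

With ε = 0.01 the point ω = 1/2 is unstable while ω = 0.52 is stable, in agreement with a zone of half-width about ε/(π/2) near the half-integer; near ω = 1 the zone is much narrower, of half-width about ε².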
E Stability of an inverted pendulum with vertically
oscillating point of suspension 


PROBLEM. Can the topmost, usually unstable, equilibrium position of a 
pendulum become stable if the point of suspension oscillates in the vertical 
direction (Figure 102)? 




Figure 102 Inverted pendulum with oscillating point of suspension 


Let the length of the pendulum be l, the amplitude of the oscillation of the
point of suspension be $a \ll l$, the period of oscillation of the point of
suspension $2\tau$, and, moreover, in the course of every half-period let the acceleration
of the point of suspension be constant and equal to $\pm c$ (then $c = 8a/\tau^2$). It
turns out that for fast enough oscillations of the point of suspension ($\tau \ll 1$)
the topmost equilibrium becomes stable.


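Solution. The equation of motion can be written in the form $\ddot x = (\omega^2 \pm d^2)x$ (the sign changes
after time $\tau$), where $\omega^2 = g/l$ and $d^2 = c/l$. If the oscillation of the suspension is fast enough,
then $d^2 > \omega^2$ ($d^2 = 8a/l\tau^2$).

As in the previous problem, $A = A_2 A_1$, where


$A_1 = \begin{pmatrix} \operatorname{ch} k\tau & \frac{1}{k} \operatorname{sh} k\tau \\ k \operatorname{sh} k\tau & \operatorname{ch} k\tau \end{pmatrix}, \qquad A_2 = \begin{pmatrix} \cos \Omega\tau & \frac{1}{\Omega} \sin \Omega\tau \\ -\Omega \sin \Omega\tau & \cos \Omega\tau \end{pmatrix},$
$k^2 = d^2 + \omega^2, \qquad \Omega^2 = d^2 - \omega^2.$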


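The stability condition |tr A| < 2 therefore has the form


(7)    $\left| 2 \operatorname{ch} k\tau \cos \Omega\tau + \left( \frac{k}{\Omega} - \frac{\Omega}{k} \right) \operatorname{sh} k\tau \sin \Omega\tau \right| < 2.$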


We will show that this condition is fulfilled for sufficiently fast oscillations of the point of
suspension, i.e., when $c \gg g$. We introduce the dimensionless variables $\varepsilon$, $\mu$:


$\frac{a}{l} = \varepsilon^2 \ll 1, \qquad \frac{g}{c} = \mu^2 \ll 1.$
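Then


$k\tau = 2\sqrt{2}\,\varepsilon\sqrt{1 + \mu^2}, \qquad \Omega\tau = 2\sqrt{2}\,\varepsilon\sqrt{1 - \mu^2},$


$\frac{k}{\Omega} - \frac{\Omega}{k} = \sqrt{\frac{1 + \mu^2}{1 - \mu^2}} - \sqrt{\frac{1 - \mu^2}{1 + \mu^2}} = 2\mu^2 + O(\mu^4).$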


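Therefore, for small $\varepsilon$ and $\mu$ we have the following expansion with error $o(\varepsilon^4 + \mu^4)$:


$\operatorname{ch} k\tau = 1 + 4\varepsilon^2(1 + \mu^2) + \tfrac{8}{3}\varepsilon^4 + \cdots, \qquad \cos \Omega\tau = 1 - 4\varepsilon^2(1 - \mu^2) + \tfrac{8}{3}\varepsilon^4 + \cdots,$


$\left( \frac{k}{\Omega} - \frac{\Omega}{k} \right) \operatorname{sh} k\tau \sin \Omega\tau = 16\varepsilon^2\mu^2 + \cdots,$


so the stability condition (7) takes the form
$2\left(1 - 16\varepsilon^4 + \tfrac{16}{3}\varepsilon^4 + 8\varepsilon^2\mu^2 + \cdots\right) + 16\varepsilon^2\mu^2 < 2,$


i.e., disregarding the small higher-order terms, $\tfrac{4}{3} \cdot 16\varepsilon^4 > 32\mu^2\varepsilon^2$, or $\mu < \varepsilon\sqrt{2/3}$, or $g/c < 2a/3l$.
This condition can be rewritten as


$N > \sqrt{\frac{3}{64}}\,\omega\,\frac{l}{a} \approx 0.22\,\omega\,\frac{l}{a},$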


where $N = 1/(2\tau)$ is the number of oscillations of the point in one unit of time. For example, if the
length of the pendulum l is 20 cm, and the amplitude of the oscillation of the point of suspension
a is 1 cm, then


$N > 0.22 \cdot \sqrt{\frac{980}{20}} \cdot 20 \approx 31$ (oscillations per second).


For example, the topmost position is stable if the frequency of oscillation of the point of 
suspension is greater than 40 per second. 
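
The approximate criterion can be compared with the exact condition (7). The following sketch is ours: the function names are hypothetical and the units are CGS (g = 980 cm/sec²), as in the example above. It evaluates the left-hand side of (7) for given l, a and N, and the approximate threshold √(3/64)·ω·l/a.

```python
import math

def inverted_pendulum_trace(l, a, N, g=980.0):
    """Left-hand side of (7), without the absolute value, for N oscillations per second."""
    tau = 1.0 / (2.0 * N)            # half-period of the suspension oscillation
    c = 8.0 * a / tau ** 2           # constant acceleration on each half-period
    d2, w2 = c / l, g / l            # d^2 = c/l, omega^2 = g/l
    if d2 <= w2:
        raise ValueError("formula (7) assumes d^2 > omega^2")
    k = math.sqrt(d2 + w2)
    Om = math.sqrt(d2 - w2)
    return (2.0 * math.cosh(k * tau) * math.cos(Om * tau)
            + (k / Om - Om / k) * math.sinh(k * tau) * math.sin(Om * tau))

def critical_frequency(l, a, g=980.0):
    """Approximate stability threshold N > sqrt(3/64) * omega * l / a."""
    return math.sqrt(3.0 / 64.0) * math.sqrt(g / l) * l / a
```

For l = 20 cm and a = 1 cm, the trace has absolute value less than 2 at N = 40 and greater than 2 at N = 20, consistent with a threshold of roughly 30 oscillations per second.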
In this chapter we study in detail some very special mechanical problems.
These problems are traditionally included in a course on classical mechanics, 
first because they were solved by Euler and Lagrange, and also because we 
live in three-dimensional euclidean space, so that most of the mechanical 
systems with a finite number of degrees of freedom which we are likely to 
encounter consist of rigid bodies. 


26 Motion in a moving coordinate system 
In this paragraph we define angular velocity. 


A Moving coordinate systems 


We look at a lagrangian system described in coordinates q, t by the lagrangian
function $L(q, \dot q, t)$. It will often be useful to shift to a moving coordinate
system $Q = Q(q, t)$.

To write the equations of motion in a moving system, it is sufficient to 
express the lagrangian function in the new coordinates. 


Theorem. If the trajectory $\gamma: q = \varphi(t)$ of Lagrange’s equations $d(\partial L/\partial \dot q)/dt =
\partial L/\partial q$ is written as $\gamma: Q = \Phi(t)$ in the local coordinates Q, t (where Q =
Q(q, t)), then the function $\Phi(t)$ satisfies Lagrange’s equations $d(\partial L'/\partial \dot Q)/dt =
\partial L'/\partial Q$, where $L'(Q, \dot Q, t) = L(q, \dot q, t)$.


PROOF. The trajectory $\gamma$ is an extremal: $\delta \int_\gamma L(q, \dot q, t)\,dt = 0$. Therefore,
$\delta \int_\gamma L'(Q, \dot Q, t)\,dt = 0$ and $\Phi(t)$ satisfies Lagrange’s equations. □

B Motions, rotations, and translational motions


We consider, in particular, the important case where q is the cartesian radius 
vector of a point relative to an inertial coordinate system k (which we will 
call stationary), and Q is the cartesian radius vector of the same point relative 
to a moving coordinate system K. 


Definition. Let k and K be oriented euclidean spaces. A motion of K relative
to k is a mapping smoothly depending on t:
$D_t: K \to k,$


which preserves the metric and the orientation (Figure 103). 


Figure 103 The motion $D_t$ decomposed as the product of a rotation $B_t$ and
translation $C_t$


Definition. A motion $D_t$ is called a rotation if it takes the origin of K to the
origin of k, i.e., if $D_t$ is a linear operator.


Theorem. Every motion $D_t$ can be uniquely written as the composition of a
rotation $B_t: K \to k$ and a translation $C_t: k \to k$:
$D_t = C_t B_t,$
where $C_t q = q + r(t)$, ($q, r \in k$).
PROOF. We set $r(t) = D_t 0$, $B_t = C_t^{-1} D_t$. Then $B_t 0 = 0$. □


Definition. A motion $D_t$ is called translational if the mapping $B_t: K \to k$
corresponding to it does not depend on t: $B_t = B_0 = B$, $D_t Q = BQ + r(t)$.


We will call k a stationary coordinate system, K a moving one, and
$q(t) \in k$ the radius-vector of a point moving relative to the stationary system;
if
(1)    $q(t) = D_t Q(t) = B_t Q(t) + r(t)$


(Figure 104), Q(t) is called the radius vector of the point relative to the moving
system.


Warning. The vector $B_t Q(t) \in k$ should not be confused with $Q(t) \in K$:
they lie in different spaces!



Figure 104 Radius vector of a point with respect to stationary (q) and moving (Q) 
coordinate systems 


C Addition of velocities 


We will now express the “absolute velocity” $\dot q$ in terms of the relative motion
Q(t) and the motion of the coordinate system, $D_t$. By differentiating with
respect to t in formula (1) we find a formula for the addition of velocities
(2)    $\dot q = \dot B Q + B \dot Q + \dot r.$


In order to clarify the meaning of the three terms in (2), we consider the 
following special cases. 


The case of translational motion ($\dot B = 0$)
In this case Equation (2) gives $\dot q = B\dot Q + \dot r$. In other words, we have shown


Theorem. If the moving system K has a translational motion relative to k, then
the absolute velocity is equal to the sum of the relative velocity and the
velocity of the motion of the system K:


(3)    $v = v' + v_0,$
where
$v = \dot q \in k$ is the absolute velocity,
$v' = B\dot Q \in k$ is the relative velocity (distinct from $\dot Q \in K$!),
$v_0 = \dot r \in k$ is the velocity of motion of the moving coordinate system.


D Angular velocity 


In the case of a rotation of K the relationship between the relative and
absolute velocities is not so simple. We first consider the case when our point is
at rest in K (i.e., $\dot Q = 0$) and the coordinate system K rotates (i.e., r = 0).
In this case the motion of the point q(t) is called a transferred rotation.


EXAMPLE. Rotation with fixed angular velocity $\omega \in k$. Let $U(t): k \to k$ be the
rotation of the space k around the $\omega$-axis through the angle $|\omega| t$. Then
B(t) = U(t)B(0) is called a uniform rotation of K with angular velocity $\omega$.

Figure 105 Angular velocity 


Clearly, the velocity of the transferred motion of the point q in this case is 
given by the formula (Figure 105) 


$\dot q = [\omega, q].$


We now turn to the general case of a rotation of K ($r = 0$, $\dot Q = 0$).


Theorem. At every moment of time t, there is a vector $\omega(t) \in k$ such that the
transferred velocity is expressed by the formula


(4)    $\dot q = [\omega, q], \qquad \forall q \in k.$


The vector $\omega$ is called the instantaneous angular velocity; clearly, it is
defined uniquely by Equation (4).


Corollary. Suppose that a rigid body K rotates around a stationary point O of
the space k. Then at every moment of time there exists an instantaneous axis
of rotation: the straight line in the body passing through O such that the
velocity of its points at the given moment of time is equal to zero. The
velocity of the remaining points is perpendicular to this straight line and is
proportional to the distance from it.


The instantaneous axis of rotation in k is given by its vector $\omega$; in K the
corresponding vector is denoted by $\Omega = B^{-1}\omega \in K$; $\Omega$ is called the vector of
angular velocity in the body.


EXAMPLE. The angular velocity of the earth is directed from the center to the North Pole; its
length is equal to $2\pi/(3600 \cdot 24)\ \text{sec}^{-1} \approx 7.3 \cdot 10^{-5}\ \text{sec}^{-1}$.


PROOF OF THE THEOREM. By (2) we have
$\dot q = \dot B Q.$


Therefore, if we express Q in terms of q, we get $\dot q = \dot B B^{-1} q = Aq$, where
$A = \dot B B^{-1}: k \to k$ is a linear operator on k.


Lemma 1. The operator A is skew-symmetric: $A^T + A = 0$.


PROOF. Since $B: K \to k$ is an orthogonal operator from one euclidean space
to another, its transpose is its inverse: $B^T = B^{-1}: k \to K$. By differentiating
the relationship $B B^T = E$ with respect to t, we get


$\dot B B^T + B \dot B^T = 0, \qquad \dot B B^{-1} + (\dot B B^{-1})^T = 0.$ □


Lemma 2. Every skew-symmetric operator A on a three-dimensional oriented 
euclidean space is the operator of vector multiplication by a fixed vector: 


$Aq = [\omega, q]$ for all $q \in \mathbb{R}^3$.


PROOF. The skew-symmetric operators from $\mathbb{R}^3$ to $\mathbb{R}^3$ form a linear space.
Its dimension is 3, since a skew-symmetric 3 × 3 matrix is determined by its
three elements below the diagonal.

The operator of vector multiplication by $\omega$ is linear and skew-symmetric.
The operators of vector multiplication by all possible vectors $\omega$ in
three-space form a linear subspace of the space of all skew-symmetric operators.

The dimension of this subspace is equal to 3. Therefore, the subspace of
vector multiplications is the space of all skew-symmetric operators. □


CONCLUSION OF THE PROOF OF THE THEOREM. By Lemmas 1 and 2,
$\dot q = Aq = [\omega, q].$ □


In cartesian coordinates the operator A is given by an antisymmetric
matrix; we denote its elements by $\pm\omega_{1,2,3}$:


$A = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix}.$


In this notation the vector $\omega = \omega_1 e_1 + \omega_2 e_2 + \omega_3 e_3$ will be an eigenvector
with eigenvalue 0. By applying A to the vector $q = q_1 e_1 + q_2 e_2 + q_3 e_3$,
we obtain by a direct calculation


$Aq = [\omega, q].$
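
The direct calculation can also be carried out mechanically. The sketch below is ours (the helper names `skew`, `apply` and `cross` are hypothetical); it builds the antisymmetric matrix from the components of ω and compares its action on a vector with the vector product.

```python
def cross(w, q):
    """Vector product [w, q] in cartesian coordinates."""
    return (w[1] * q[2] - w[2] * q[1],
            w[2] * q[0] - w[0] * q[2],
            w[0] * q[1] - w[1] * q[0])

def skew(w):
    """Antisymmetric matrix A of the operator q -> [w, q]."""
    w1, w2, w3 = w
    return ((0.0, -w3, w2),
            (w3, 0.0, -w1),
            (-w2, w1, 0.0))

def apply(A, q):
    """Matrix-vector product A q for a 3 x 3 matrix given as nested tuples."""
    return tuple(sum(A[i][j] * q[j] for j in range(3)) for i in range(3))
```

For example, `apply(skew(w), q)` and `cross(w, q)` agree for w = (1, 2, 3), q = (4, 5, 6), and `apply(skew(w), w)` is the zero vector, which is the statement that ω is an eigenvector with eigenvalue 0.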


E Transferred velocity 


The case of purely rotational motion 


Suppose now that the system K rotates (r = 0), and that a point in K
is moving ($\dot Q \ne 0$). From (2) we find (Figure 106)


$\dot q = \dot B Q + B \dot Q = [\omega, q] + v'.$


In other words, we have shown


Figure 106 Addition of velocities 


Theorem. If a moving system K rotates relative to $O \in k$, then the absolute
velocity is equal to the sum of the relative velocity and the transferred
velocity:


(5)    $v = v' + v_n,$
where
$v = \dot q \in k$ is the absolute velocity,
$v' = B\dot Q \in k$ is the relative velocity,
$v_n = \dot B Q = [\omega, q] \in k$ is the transferred velocity of rotation.


Finally, the general case can be reduced to the two cases above, if we
consider an auxiliary system $K_1$ which moves by translation with respect to
k and with respect to which K moves by rotating around $O \in K_1$. From
formula (2) one can see that


$v = v' + v_n + v_0,$
where
$v = \dot q \in k$ is the absolute velocity,
$v' = B\dot Q \in k$ is the relative velocity,
$v_n = \dot B Q = [\omega, q - r] \in k$ is the transferred velocity of rotation,
and
$v_0 = \dot r \in k$ is the velocity of motion of the moving coordinate system.


PROBLEM. Show that the angular velocity of a rigid body does not depend on 
the choice of origin of the moving system K in the body. 


PROBLEM. Show that the most general movement of a rigid body is a helical
movement, i.e., the composition of a rotation through angle $\varphi$ around some
axis and a translation by h along it.


PROBLEM. A watch lies on a table. Find the angular velocity of the hands of the watch: (a) relative
to the earth, (b) relative to an inertial coordinate system.
Hint. If we are given three coordinate systems k, $K_1$, and $K_2$, then the angular velocity of $K_2$
relative to k is equal to the sum of the angular velocities of $K_1$ relative to k and of $K_2$ relative
to $K_1$, since


$(E + A_1 t + \cdots)(E + A_2 t + \cdots) = E + (A_1 + A_2)t + \cdots.$


27 Inertial forces and the Coriolis force 


The equations of motion in a non-inertial coordinate system differ from the equations of motion 
in an inertial system by additional terms called inertial forces. This allows us to detect experi- 
mentally the non-inertial nature of a system (for example, the rotation of the earth around its 
axis). 


A Coordinate systems moving by translation 


Theorem. In a coordinate system K which moves by translation relative to an
inertial system k, the motion of a mechanical system takes place as if the
coordinate system were inertial, but on every point of mass m an additional
“inertial force” acted: $F = -m\ddot r$, where $\ddot r$ is the acceleration of the system K.


PROOF. If $Q = q - r(t)$, then $m\ddot Q = m\ddot q - m\ddot r$. The effect of the translation of
the coordinate system is reduced in this way to the appearance of an additional
homogeneous force field $-mW$, where W is the acceleration of the
origin. □


Figure 107 Overload 


EXAMPLE 1. At the moment of takeoff, a rocket has acceleration $\ddot r$ directed upward (Figure 107).
Thus, the coordinate system K connected to the rocket is not inertial, and an observer inside can
detect the existence of a force field mW and measure the inertial force, for example, by means of
weighted springs. In this case the inertial force is called overload.*


EXAMPLE 2. When jumping from a loft, a person has acceleration g, directed downwards. Thus,
the sum of the inertial force and the force of gravity is equal to zero; weighted springs show that
the weight of any object is equal to zero, so such a state is called weightlessness. In exactly the
same way, weightlessness is observed in the free ballistic flight of a satellite since the force of
inertia is opposite to the gravitational force of the earth.


EXAMPLE 3. If the point of suspension of a pendulum moves with acceleration W(t), then the
pendulum moves as if the force of gravity g were variable and equal to g − W(t).


* Translator’s note. The word overload is the literal translation of the Russian term peregruzka. 
There does not seem to be an English term for this particular kind of inertial force. 
B Rotating coordinate systems

Let $B_t: K \to k$ be a rotation of the coordinate system K relative to the
stationary coordinate system k. We will denote by $Q(t) \in K$ the radius vector of
a moving point in the moving coordinate system, and by $q(t) = B_t Q(t) \in k$
the radius vector in the stationary system. The vector of angular velocity in
the moving coordinate system is denoted, as in Section 26, by $\Omega$. We assume
that the motion of the point q in k is subject to Newton’s equation $m\ddot q = f(q, \dot q)$.


Theorem. Motion in a rotating coordinate system takes place as if three
additional inertial forces acted on every moving point Q of mass m:


1. the inertial force of rotation: $m[\dot\Omega, Q]$,
2. the Coriolis force: $2m[\Omega, \dot Q]$, and
3. the centrifugal force: $m[\Omega, [\Omega, Q]]$.


Thus
$m\ddot Q = F - m[\dot\Omega, Q] - 2m[\Omega, \dot Q] - m[\Omega, [\Omega, Q]],$
where


$BF(Q, \dot Q) = f(BQ, (BQ)^{\cdot}).$


The first of the inertial forces is observed only in nonuniform rotation. 
The second and third are present even in uniform rotation. 


Figure 108 Centrifugal force of inertia 


The centrifugal force (Figure 108) is always directed outward from the
instantaneous axis of rotation $\Omega$; it has magnitude $m|\Omega|^2 r$, where r is the
distance to this axis. This force does not depend on the velocity of the relative
motion, and acts even on a body at rest in the coordinate system K.

The Coriolis force depends on the velocity $\dot Q$. In the northern hemisphere
of the earth it deflects every body moving along the earth to the right, and 
every falling body eastward. 
PROOF OF THE THEOREM. We notice that for any vector $X \in K$ we have
$\dot B X = B[\Omega, X]$. In fact, by Section 26, $\dot B X = [\omega, x] = [B\Omega, BX]$, where $x = BX$. This is
equal to $B[\Omega, X]$ since the operator B preserves the metric and orientation,
and therefore the vector product.

Since $q = BQ$ we see that $\dot q = \dot B Q + B\dot Q = B(\dot Q + [\Omega, Q])$. Differentiating
once more, we obtain


$\ddot q = \dot B(\dot Q + [\Omega, Q]) + B(\ddot Q + [\dot\Omega, Q] + [\Omega, \dot Q])$
$\quad = B([\Omega, (\dot Q + [\Omega, Q])] + \ddot Q + [\dot\Omega, Q] + [\Omega, \dot Q])$
$\quad = B(\ddot Q + 2[\Omega, \dot Q] + [\Omega, [\Omega, Q]] + [\dot\Omega, Q]).$ □


(We again used the relationship $\dot B X = B[\Omega, X]$; this time $X = \dot Q +
[\Omega, Q]$.)


We will consider in more detail the effect of the earth’s rotation on laboratory experiments.
Since the earth rotates practically uniformly, we can take $\dot\Omega = 0$. The centrifugal force has its
largest value at the equator, where it attains $\Omega^2\rho/g \approx (7.3 \times 10^{-5})^2 \cdot 6.4 \times 10^6/9.8 \approx 3/1000$
the weight. Within the limits of a laboratory it changes little, so to observe it one must travel
some distance. Thus, within the limits of a laboratory the rotation of the earth appears only in
the form of the Coriolis force: in the coordinate system Q associated to the earth, we have, with
good accuracy,


$\frac{d}{dt}\, m\dot Q = mg + 2m[\dot Q, \Omega]$
(the centrifugal force is taken into account in g).
EXAMPLE 1. A stone is thrown (without initial velocity) into a 250 m deep mine shaft at the
latitude of Leningrad. How far does it deviate from the vertical?
We solve the equation


$\ddot Q = g + 2[\dot Q, \Omega]$
by the following approach, taking $\Omega \ll 1$. We set (Figure 109)
$Q = Q_1 + Q_2,$


where $Q_2(0) = \dot Q_2(0) = 0$ and $Q_1 = Q_1(0) + gt^2/2$. For $Q_2$, we then get
$\ddot Q_2 = 2[gt, \Omega] + O(\Omega^2), \qquad Q_2 \approx \frac{t^3}{3}[g, \Omega] = \frac{2t}{3}[h, \Omega], \qquad h = \frac{gt^2}{2}.$



Figure 109 Displacement of a falling stone by Coriolis force 
From this it is apparent that the stone lands about
$\frac{2t}{3}|h||\Omega| \cos\lambda \approx \frac{2 \cdot 7}{3} \cdot 250 \cdot 7 \cdot 10^{-5} \cdot \frac{1}{2}\ \text{m} \approx 4\ \text{cm}$


to the east.
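
The arithmetic of this example can be reproduced with a short sketch. The function name and default values are ours; we assume g = 9.8 m/sec², |Ω| = 7.3·10⁻⁵ sec⁻¹ and a latitude of 60° for Leningrad.

```python
import math

def eastward_deflection(depth, latitude_deg, g=9.8, Omega=7.3e-5):
    """First-order Coriolis deflection (2t/3) * h * |Omega| * cos(latitude), in meters."""
    t = math.sqrt(2.0 * depth / g)   # time of free fall through the given depth
    return (2.0 * t / 3.0) * depth * Omega * math.cos(math.radians(latitude_deg))
```

Calling `eastward_deflection(250, 60)` gives a value of a few centimeters, in agreement with the estimate of about 4 cm.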


PROBLEM. By how much would the Coriolis force displace a missile fired vertically upwards at
Leningrad from falling back onto its launching pad, if the missile rose 1 kilometer?


EXAMPLE 2 (The Foucault pendulum). Consider small oscillations of an ideal pendulum, taking
into account the Coriolis force. Let $e_x$, $e_y$, and $e_z$ be the axes of a coordinate system associated
to the earth, with $e_z$ directed upwards, and $e_x$ and $e_y$ in the horizontal plane (Figure 110). In
the approximation of small oscillations, $\dot z = 0$ (in comparison with $\dot x$ and $\dot y$); therefore, the
horizontal component of the Coriolis force will be $2m\dot y\Omega_z e_x - 2m\dot x\Omega_z e_y$. From this we get the
equations of motion
$\ddot x = -\omega^2 x + 2\dot y\Omega_z,$    ($\Omega_z = |\Omega| \sin \lambda_0$, where $\lambda_0$ is the latitude)
$\ddot y = -\omega^2 y - 2\dot x\Omega_z.$


Figure 110 Coordinate system for studying the motion of a Foucault pendulum


If we set $x + iy = w$, then $\dot w = \dot x + i\dot y$, $\ddot w = \ddot x + i\ddot y$, and the two equations reduce to one
complex equation


$\ddot w + i2\Omega_z \dot w + \omega^2 w = 0.$


We solve it: $w = e^{\lambda t}$, $\lambda^2 + 2i\Omega_z\lambda + \omega^2 = 0$, $\lambda = -i\Omega_z \pm i\sqrt{\Omega_z^2 + \omega^2}$. But $\Omega_z^2 \ll \omega^2$. Therefore,
$\sqrt{\Omega_z^2 + \omega^2} = \omega + O(\Omega_z^2)$, from which it follows, by disregarding $\Omega_z^2$, that


$\lambda \approx -i\Omega_z \pm i\omega$
or, to the same accuracy,


$w = e^{-i\Omega_z t}(c_1 e^{i\omega t} + c_2 e^{-i\omega t}).$


For $\Omega_z = 0$ we get the usual harmonic oscillations of a spherical pendulum. We see that the
effect of the Coriolis force reduces to a rotation of the whole picture with angular velocity $-\Omega_z$,
where $|\Omega_z| = |\Omega| \sin \lambda_0$.

In particular, if the initial conditions correspond to a planar motion ($y(0) = \dot y(0) = 0$), then
the plane of oscillation will be rotating with angular velocity $-\Omega_z$ with respect to the earth's
coordinate system (Figure 111).

At a pole, the plane of oscillation makes one turn in a twenty-four-hour day (and is fixed 
with respect to a coordinate system not rotating with the earth). At the latitude of Moscow (56°) 
the plane of oscillation turns 0.83 of a rotation in a twenty-four-hour day, i.e., 12.5° in an hour. 
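
These figures follow from $|\Omega_z| = |\Omega| \sin \lambda_0$. A minimal sketch (the function names are ours, and we take one rotation of the earth per twenty-four hours, as in the text):

```python
import math

def foucault_turns_per_day(latitude_deg):
    """Fraction of a full turn made by the plane of oscillation in one day."""
    return math.sin(math.radians(latitude_deg))

def foucault_degrees_per_hour(latitude_deg):
    """Rate of rotation of the plane of oscillation, in degrees per hour."""
    return 360.0 * foucault_turns_per_day(latitude_deg) / 24.0
```

At a latitude of 56° this gives about 0.83 of a turn per day and roughly 12.5° per hour; at a pole it gives one full turn per day.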
Figure 111 Trajectory of a Foucault pendulum


PROBLEM. A river flows with velocity 3 km/hr. For what radius of curvature of a river bend is the
Coriolis force from the earth’s rotation greater than the centrifugal force determined by the flow
of the river?

Answer. The radius of curvature must be at least on the order of 10 km for a river of medium
width.

The solution of this problem explains why a large river in the northern hemisphere (for 
example, the Volga in the middle of its course), undermines the base of its right bank, while a 
river like the Moscow River, with its abrupt bends of small radius, undermines either the left or 
right (whichever is outward from the bend) bank. 


28 Rigid bodies 


In this paragraph we define a rigid body and its inertia tensor, inertia ellipsoid, moments of 
inertia, and axes of inertia. 


A The configuration manifold of a rigid body 


Definition. A rigid body is a system of point masses, constrained by holonomic 
relations expressed by the fact that the distance between points is constant: 


(1)    $|x_i - x_j| = r_{ij} = \text{const}.$


Theorem. The configuration manifold of a rigid body is a six-dimensional
manifold, namely, $\mathbb{R}^3 \times SO(3)$ (the direct product of a three-dimensional
space $\mathbb{R}^3$ and the group SO(3) of its rotations), as long as there are three
points in the body not in a straight line.


PROOF. Let $x_1$, $x_2$, and $x_3$ be three points of the body which do not lie in a
straight line. Consider the right-handed orthonormal frame whose first
vector is in the direction of $x_2 - x_1$, and whose second is on the $x_3$ side in the
$x_1 x_2 x_3$-plane (Figure 112). It follows from the conditions $|x_i - x_j| = r_{ij}$
(i = 1, 2, 3), that the positions of all the points of the body are uniquely
determined by the positions of $x_1$, $x_2$, and $x_3$, which are given by the position
of the frame. Finally, the space of frames in $\mathbb{R}^3$ is $\mathbb{R}^3 \times SO(3)$, since every
frame is obtained from a fixed one by a rotation and a translation.⁴⁸ □


48 Strictly speaking, the configuration space of a rigid body is $\mathbb{R}^3 \times O(3)$, and $\mathbb{R}^3 \times SO(3)$ is
only one of the two connected components of this manifold, corresponding to the orientation of
the body.


Figure 112 Configuration manifold of a rigid body


PROBLEM. Find the configuration space of a rigid body, all of whose points lie on a line.


ANSWER. $\mathbb{R}^3 \times S^2$.


Definition. A rigid body with a fixed point O is a system of point masses
constrained by the condition $x_1 = O$ in addition to conditions (1).


Clearly, its configuration manifold is the three-dimensional rotation 
group SO(3). 


B Conservation laws 


Consider the problem of the motion of a free rigid body under its own inertia,
outside of any force field. For an (approximate) example we can use the 
rolling of a spaceship. 

The system admits all translational displacements: they do not change 
the lagrangian function. By Noether’s theorem there exist three first integrals: 
the three components of the vector of momentum. Therefore, we have shown 


Theorem. Under the free motion of a rigid body, its center of mass moves 
uniformly and linearly. 


Now we can look at an inertial coordinate system in which the center of 
inertia is stationary. Then we have 


Corollary. A free rigid body rotates about its center of mass as if the center of 
mass were fixed at a stationary point O. 


In this way, the problem is reduced to the problem, with three degrees of 
freedom, of the motion of a rigid body around a fixed point O. We will study 
this problem in more detail (not necessarily assuming that O is the center of 
mass of the body). 

The lagrangian function admits all rotations around O. By Noether’s 
theorem there exist three corresponding first integrals: the three components 
of the vector of angular momentum. The total energy of the system, E = T,
is also conserved (here it is equal to the kinetic energy). Therefore, we have
shown 


Theorem. In the problem of the motion of a rigid body around a stationary point 
O, in the absence of outside forces, there are four first integrals: $M_x$, $M_y$,
$M_z$, and $E$.


From this theorem we can get qualitative conclusions about the motion 
without any calculation. 

The position and velocity of the body are determined by a point in the 
six-dimensional manifold TSO(3)—the tangent bundle of the configuration 
manifold SO(3). The first integrals $M_x$, $M_y$, $M_z$, and $E$ are four functions on
TSO(3). One can verify that in the general case (if the body does not have any 
particular symmetry) these four functions are independent. Therefore, the 
four equations 


$$M_x = C_1, \quad M_y = C_2, \quad M_z = C_3, \quad E = C_4 > 0$$


define a two-dimensional submanifold $V_c$ in the six-dimensional manifold
TSO(3).

This manifold is invariant: if the initial conditions of motion give a point
on $V_c$, then for all time of the motion, the point in TSO(3) corresponding to
the position and velocity of the body remains in $V_c$.

Therefore, $V_c$ admits a tangent vector field (namely, the field of velocities
of the motion on TSO(3)); for $C_4 > 0$ this field cannot have singular points.
Furthermore, it is easy to verify that $V_c$ is compact (using $E$) and orientable
(since TSO(3) is orientable).⁴⁹

In topology it is proved that the only connected orientable compact two-
dimensional manifolds are the spheres with n handles, $n \geq 0$ (Figure 113).
Of these, only the torus (n = 1) admits a tangent vector field without singular
points. Therefore, the invariant manifold $V_c$ is a two-dimensional torus (or
several tori).

We will see later that one can choose angular coordinates $\varphi_1$, $\varphi_2$ (mod $2\pi$)
on this torus such that a motion represented by a point of $V_c$ is given by the
equations $\dot\varphi_1 = \omega_1(c)$, $\dot\varphi_2 = \omega_2(c)$.


⁴⁹ The following assertions are easy to prove:


1. Let $f_1, \ldots, f_k : M \to \mathbb{R}$ be functions on an oriented manifold M. Consider the set V given by
the equations $f_1 = c_1, \ldots, f_k = c_k$. Assume that the gradients of $f_1, \ldots, f_k$ are linearly
independent at each point. Then V is orientable.

2. The direct product of orientable manifolds is orientable.

3. The tangent bundle TSO(3) is the direct product $\mathbb{R}^3 \times SO(3)$. A manifold whose tangent
bundle is a direct product is called parallelizable. The group SO(3) (like every Lie group) is
parallelizable.

4. A parallelizable manifold is orientable.


It follows from assertions 1-4 that SO(3), TSO(3), and $V_c$ are orientable.


Figure 113. Two-dimensional compact connected orientable manifolds 


In other words, a rotation of a rigid body is represented by the super-
position of two periodic motions with (usually) different periods: if the
frequencies $\omega_1$ and $\omega_2$ are non-commensurable, then the body never returns
to its original state of motion. The magnitudes of the frequencies $\omega_1$ and $\omega_2$
depend on the initial conditions C.


C The inertia operator⁵⁰


We now go on to the quantitative theory and introduce the following 
notation. Let k be a stationary coordinate system and K a coordinate system 
rotating together with the body around the point O: in K the body is at rest. 




Figure 114 Radius vector and vectors of velocity, angular velocity and angular 
momentum of a point of the body in space 


Every vector in K is carried over to k by an operator B. Corresponding 
vectors in K and k will be denoted by the same letter; capital for K and lower 
case for k. So, for example (Figure 114), 
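$\mathbf{q} \in k$ is the radius vector of a point in space;
$\mathbf{Q} \in K$ is its radius vector in the body, $\mathbf{q} = B\mathbf{Q}$;
$\mathbf{v} = \dot{\mathbf{q}} \in k$ is the velocity vector of a point in space;
$\mathbf{V} \in K$ is the same vector in the body, $\mathbf{v} = B\mathbf{V}$;
$\boldsymbol{\omega} \in k$ is the angular velocity in space;
$\boldsymbol{\Omega} \in K$ is the angular velocity in the body, $\boldsymbol{\omega} = B\boldsymbol{\Omega}$;
$\mathbf{m} \in k$ is the angular momentum in space;
$\mathbf{M} \in K$ is the angular momentum in the body, $\mathbf{m} = B\mathbf{M}$.
Since the operator $B: K \to k$ preserves the metric and orientation, it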
preserves the scalar and vector products. 


⁵⁰ Often called the inertia tensor (translator's note).

By definition of angular velocity (Section 26),

$$\mathbf{v} = [\boldsymbol{\omega}, \mathbf{q}].$$

By definition of the angular momentum of a point of mass m with respect
to O,

$$\mathbf{m} = [\mathbf{q}, m\mathbf{v}] = m[\mathbf{q}, [\boldsymbol{\omega}, \mathbf{q}]].$$

Therefore,

$$\mathbf{M} = m[\mathbf{Q}, [\boldsymbol{\Omega}, \mathbf{Q}]].$$

Hence, there is a linear operator transforming $\boldsymbol{\Omega}$ to $\mathbf{M}$:

$$A: K \to K, \qquad A\boldsymbol{\Omega} = \mathbf{M}.$$

This operator still depends on a point of the body ($\mathbf{Q}$) and its mass (m).


Lemma. The operator A is symmetric. 


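PROOF. In view of the relation $([\mathbf{a}, \mathbf{b}], \mathbf{c}) = ([\mathbf{c}, \mathbf{a}], \mathbf{b})$ we have, for any $\mathbf{X}$ and
$\mathbf{Y}$ in K,


$$(A\mathbf{X}, \mathbf{Y}) = m([\mathbf{Q}, [\mathbf{X}, \mathbf{Q}]], \mathbf{Y}) = m([\mathbf{Y}, \mathbf{Q}], [\mathbf{X}, \mathbf{Q}]),$$


and the last expression is symmetric in $\mathbf{X}$ and $\mathbf{Y}$. □


By substituting the vector of angular velocity $\boldsymbol{\Omega}$ for $\mathbf{X}$ and $\mathbf{Y}$ and noticing
that $[\boldsymbol{\Omega}, \mathbf{Q}]^2 = \mathbf{V}^2 = \mathbf{v}^2$, we obtain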


Corollary. The kinetic energy of a point of a body is a quadratic form with
respect to the vector of angular velocity $\boldsymbol{\Omega}$, namely:


$$T = \tfrac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = \tfrac{1}{2}(\mathbf{M}, \boldsymbol{\Omega}).$$


The symmetric operator A is called the inertia operator (or tensor) of the
point $\mathbf{Q}$.

If a body consists of many points $\mathbf{Q}_i$ with masses $m_i$, then by summing we
obtain


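Theorem. The angular momentum $\mathbf{M}$ of a rigid body with respect to a stationary
point O depends linearly on the angular velocity $\boldsymbol{\Omega}$, i.e., there exists a linear
operator $A: K \to K$, $A\boldsymbol{\Omega} = \mathbf{M}$. The operator A is symmetric.

The kinetic energy of a body is a quadratic form with respect to the angular
velocity $\boldsymbol{\Omega}$,

$$T = \tfrac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = \tfrac{1}{2}(\mathbf{M}, \boldsymbol{\Omega}).$$


PROOF. By definition, the angular momentum of a body is equal to the sum
of the angular momenta of its points:


$$\mathbf{M} = \sum_i \mathbf{M}_i = \sum_i A_i\boldsymbol{\Omega} = A\boldsymbol{\Omega}, \quad \text{where } A = \sum_i A_i.$$

Since by the lemma the inertia operator $A_i$ of every point is symmetric,
the operator A is also symmetric. For kinetic energy we obtain, by definition,


$$T = \sum_i T_i = \tfrac{1}{2}\sum_i (\mathbf{M}_i, \boldsymbol{\Omega}) = \tfrac{1}{2}(\mathbf{M}, \boldsymbol{\Omega}) = \tfrac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}).$$ □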
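The theorem can be checked numerically on a small example. The following is only a sketch: the three point masses and the angular velocity are made-up sample values, and the helper names (`inertia_operator`, `apply`) are illustrative, not notation from the text. It builds $A = \sum_i A_i$ from the definition $A_i\mathbf{X} = m_i[\mathbf{Q}_i, [\mathbf{X}, \mathbf{Q}_i]]$ and compares $\tfrac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega})$ with the kinetic energy summed point by point.

```python
def cross(a, b):
    """Vector product [a, b]."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product (a, b)."""
    return sum(x * y for x, y in zip(a, b))


def inertia_operator(points):
    """Matrix of A = sum_i A_i, where A_i X = m_i [Q_i, [X, Q_i]]."""
    basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    A = [[0.0] * 3 for _ in range(3)]
    for m, Q in points:
        for col, e in enumerate(basis):
            image = cross(Q, cross(e, Q))  # [Q, [e, Q]] gives one column of A_i / m
            for row in range(3):
                A[row][col] += m * image[row]
    return A


def apply(A, X):
    """Apply the matrix A to the vector X."""
    return tuple(dot(row, X) for row in A)


# Hypothetical sample body: (mass, radius vector Q in the body frame K).
points = [(1.0, (1.0, 0.0, 0.0)), (2.0, (0.0, 1.0, 1.0)), (0.5, (1.0, -2.0, 0.5))]
Omega = (0.3, -1.0, 2.0)  # sample angular velocity in the body

A = inertia_operator(points)
M = apply(A, Omega)                      # angular momentum M = A Omega
T_from_A = 0.5 * dot(M, Omega)           # T = (1/2)(A Omega, Omega)
T_direct = 0.5 * sum(m * dot(cross(Omega, Q), cross(Omega, Q)) for m, Q in points)
```

Under these assumptions the matrix `A` comes out symmetric and the two values of the kinetic energy agree up to rounding.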


D Principal axes 


Like every symmetric operator, A has three mutually orthogonal char-
acteristic directions. Let $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3 \in K$ be their unit vectors and $I_1$, $I_2$,
and $I_3$ their eigenvalues. In the basis $\mathbf{e}_i$, the inertia operator and the kinetic
energy have a particularly simple form:


$$M_i = I_i\Omega_i, \qquad T = \tfrac{1}{2}(I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2).$$


The axes $\mathbf{e}_i$ are called the principal axes of the body at the point O.

Finally, if the numbers $I_1$, $I_2$, and $I_3$ are not all different, then the axes $\mathbf{e}_i$
are not uniquely defined. We will further clarify the meaning of the eigen-
values $I_1$, $I_2$, and $I_3$.


Theorem. For a rotation of a rigid body fixed at a point O, with angular velocity
$\boldsymbol{\Omega} = \Omega\mathbf{e}$ ($\Omega = |\boldsymbol{\Omega}|$) around the $\mathbf{e}$ axis, the kinetic energy is equal to


$$T = \tfrac{1}{2}I_e\Omega^2, \quad \text{where } I_e = \sum_i m_i r_i^2$$

and $r_i$ is the distance of the i-th point to the $\mathbf{e}$ axis (Figure 115).


Figure 115 Kinetic energy of a body rotating around an axis

PROOF. By definition $T = \tfrac{1}{2}\sum_i m_i v_i^2$; but $|\mathbf{v}_i| = \Omega r_i$, so $T = \tfrac{1}{2}\left(\sum_i m_i r_i^2\right)\Omega^2$. □

The number $I_e$ depends on the direction $\mathbf{e}$ of the axis of rotation $\boldsymbol{\Omega}$ in the
body.

Definition. $I_e$ is called the moment of inertia of the body with respect to the
$\mathbf{e}$ axis:

$$I_e = \sum_i m_i r_i^2.$$

By comparing the two expressions for T we obtain:


Corollary. The eigenvalues $I_i$ of the inertia operator A are the moments of
inertia of the body with respect to the principal axes $\mathbf{e}_i$.
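As an illustration of the corollary, here is a minimal sketch with a hypothetical body: unit masses at $(\pm 1, 0, 0)$, $(0, \pm 2, 0)$, $(0, 0, \pm 3)$, chosen so that the coordinate axes are principal axes. The function name `moment_about_axis` is an assumption of this example. It evaluates $I_e = \sum_i m_i r_i^2$ directly and compares it with the quadratic form $I_1 e_1^2 + I_2 e_2^2 + I_3 e_3^2 = (A\mathbf{e}, \mathbf{e})$.

```python
import math


def moment_about_axis(points, e):
    """I_e = sum_i m_i r_i^2, with r_i the distance from Q_i to the axis through O along the unit vector e."""
    total = 0.0
    for m, Q in points:
        along = sum(q * c for q, c in zip(Q, e))         # (Q, e)
        r_squared = sum(q * q for q in Q) - along ** 2   # |Q|^2 - (Q, e)^2
        total += m * r_squared
    return total


# Hypothetical body: unit masses on the coordinate axes, symmetric about O.
points = [(1.0, (s * 1.0, 0.0, 0.0)) for s in (1, -1)]
points += [(1.0, (0.0, s * 2.0, 0.0)) for s in (1, -1)]
points += [(1.0, (0.0, 0.0, s * 3.0)) for s in (1, -1)]

# Principal moments: moments of inertia about the coordinate (principal) axes.
I1 = moment_about_axis(points, (1.0, 0.0, 0.0))
I2 = moment_about_axis(points, (0.0, 1.0, 0.0))
I3 = moment_about_axis(points, (0.0, 0.0, 1.0))

# An oblique axis: compare the direct sum with the quadratic form in the principal basis.
c = 1.0 / math.sqrt(3.0)
e = (c, c, c)
I_direct = moment_about_axis(points, e)
I_quadratic_form = I1 * e[0] ** 2 + I2 * e[1] ** 2 + I3 * e[2] ** 2
```

For this sample body the principal moments are 26, 20, and 10, and both computations give $I_e = 56/3$ for the diagonal axis.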


E The inertia ellipsoid 


In order to study the dependence of the moment of inertia $I_e$ upon the direc-
tion of the axis $\mathbf{e}$ in a body, we consider the vectors $\mathbf{e}/\sqrt{I_e}$, where the unit
vector $\mathbf{e}$ runs over the unit sphere.


Theorem. The vectors $\mathbf{e}/\sqrt{I_e}$ form an ellipsoid in K.


PROOF. If $\boldsymbol{\Omega} = \mathbf{e}/\sqrt{I_e}$, then the quadratic form $T = \tfrac{1}{2}(A\boldsymbol{\Omega}, \boldsymbol{\Omega})$ is equal to $\tfrac{1}{2}$.
Therefore, $\{\boldsymbol{\Omega}\}$ is the level set of a positive-definite quadratic form, i.e., an
ellipsoid. □


One could say that this ellipsoid consists of those angular velocity vectors
$\boldsymbol{\Omega}$ whose kinetic energy is equal to $\tfrac{1}{2}$.


Definition. The ellipsoid $\{\boldsymbol{\Omega} : (A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = 1\}$ is called the inertia ellipsoid of the
body at the point O (Figure 116).




Figure 116 Ellipsoid of inertia 


In terms of the principal axes $\mathbf{e}_i$, the equation of the inertia ellipsoid has
the form

$$I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2 = 1.$$


Therefore the principal axes of the inertia ellipsoid are directed along the
principal axes of the inertia tensor, and their lengths are inversely proportional
to $\sqrt{I_i}$.


Remark. If a body is stretched out along some axis, then the moment of 
inertia with respect to this axis is small, and consequently, the inertia el- 
lipsoid is also stretched out along this axis; thus, the inertia ellipsoid may 
resemble the shape of the body. 

If a body has an axis of symmetry of order k passing through O (so that it
coincides with itself after rotation by $2\pi/k$ around the axis), then the inertia
ellipsoid also has the same symmetry with respect to this axis. But a triaxial
ellipsoid does not have axes of symmetry of order k > 2. Therefore, every axis
of symmetry of a body of order k > 2 is an axis of rotation of the inertia
ellipsoid and, therefore, a principal axis.


EXAMPLE. The inertia ellipsoid of three points of mass m at the vertices of an equilateral triangle 
with center O is an ellipsoid of revolution around an axis normal to the plane of the triangle 
(Figure 117). 


Figure 117 Ellipsoid of inertia of an equilateral triangle 


If there are several such axes, then the inertia ellipsoid is a sphere, and any 
axis is principal. 


PROBLEM. Draw the line through the center of a cube such that the sum of the squares of its 
distances from the vertices of the cube is: (a) largest, (b) smallest. 


We now remark that the inertia ellipsoid (or the inertia operator or the
moments of inertia $I_1$, $I_2$, and $I_3$) completely determines the rotational
characteristics of our body: if we consider two bodies with identical inertia
ellipsoids, then for identical initial conditions they will move identically (since
they have the same lagrangian function L = T).

Therefore, from the point of view of the dynamics of rotation around O,
the space of all rigid bodies is three-dimensional, however many points com-
pose the body.

We can even consider the "solid rigid body of density $\rho(\mathbf{Q})$," having in
mind the limit as $\Delta\mathbf{Q} \to 0$ of the sequence of bodies with a finite number of
points $\mathbf{Q}_i$ with masses $\rho(\mathbf{Q}_i)\Delta\mathbf{Q}_i$ (Figure 118) or, what amounts to the same
thing, any body with moments of inertia


$$I_e = \iiint \rho(\mathbf{Q})\, r^2(\mathbf{Q})\, d\mathbf{Q},$$


where r is the distance from $\mathbf{Q}$ to the $\mathbf{e}$ axis.


Figure 118 Continuous solid rigid body

EXAMPLE. Find the principal axes and moments of inertia of the uniform planar plate $|x| \leq a$,
$|y| \leq b$, $z = 0$ with respect to O.

Solution. Since the plate has three planes of symmetry, the inertia ellipsoid has the same planes
of symmetry and, therefore, principal axes x, y, and z. Furthermore,


$$I_y = \int_{-a}^{a}\int_{-b}^{b} x^2 \rho \, dx \, dy = \frac{ma^2}{3},$$


where $m = 4ab\rho$ is the mass of the plate. In the same way

$$I_x = \frac{mb^2}{3}.$$


Clearly, $I_z = I_x + I_y$.


PROBLEM. Show that the moments of inertia of any body satisfy the triangle inequalities

$$I_3 \leq I_1 + I_2, \quad I_2 \leq I_3 + I_1, \quad \text{and} \quad I_1 \leq I_2 + I_3,$$

and that equality holds only for a planar body.


PROBLEM. Find the axes and moments of inertia of a homogeneous ellipsoid of mass m with 
semiaxes a, b, and c relative to the center O. 
Hint. First look at the sphere. 


PROBLEM. Prove Steiner’s theorem: The moments of inertia of any rigid body 
relative to two parallel axes, one of which passes through the center of mass, 
are related by the equation 

$$I = I_0 + mr^2,$$


where m is the mass of the body, r is the distance between the axes, and $I_0$
is the moment of inertia relative to the axis passing through the center of
mass.


Thus the moment of inertia relative to an axis passing through the center 
of mass is less than the moment of inertia relative to any parallel axis. 
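Steiner's theorem is easy to test numerically before proving it. The sketch below uses an arbitrary, made-up set of point masses and an arbitrary point `P` for the second axis; the function names are illustrative only. It computes the moment of inertia about the axis through the center of mass and about a parallel axis through `P`, together with the distance between the two axes.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def distance_sq_to_line(Q, origin, e):
    """Squared distance from Q to the line through `origin` along the unit vector e."""
    d = sub(Q, origin)
    return dot(d, d) - dot(d, e) ** 2


def moment_about_line(points, origin, e):
    """Moment of inertia sum_i m_i r_i^2 about the line through `origin` along e."""
    return sum(m * distance_sq_to_line(Q, origin, e) for m, Q in points)


# Hypothetical sample data: (mass, position).
points = [(1.0, (0.0, 0.0, 0.0)), (2.0, (1.0, 2.0, -1.0)), (3.0, (-2.0, 0.5, 4.0))]
e = (0.0, 0.0, 1.0)        # common direction of the two parallel axes
P = (2.0, -1.0, 5.0)       # a point on the second axis

total_mass = sum(m for m, _ in points)
center = tuple(sum(m * Q[k] for m, Q in points) / total_mass for k in range(3))

I_0 = moment_about_line(points, center, e)       # axis through the center of mass
I = moment_about_line(points, P, e)              # parallel axis through P
r_squared = distance_sq_to_line(P, center, e)    # squared distance between the axes
```

With these values `I` and `I_0 + total_mass * r_squared` agree up to rounding, and `I >= I_0`, in line with the remark above.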


PROBLEM. Find the principal axes and moments of inertia of a uniform tetrahedron relative to 
its vertices. 


PROBLEM. Draw the angular momentum vector $\mathbf{M}$ for a body with a given inertia ellipsoid
rotating with a given angular velocity $\boldsymbol{\Omega}$.


ANSWER. $\mathbf{M}$ is in the direction normal to the inertia ellipsoid at a point on the $\boldsymbol{\Omega}$ axis (Figure 119).


Figure 119 Angular velocity, ellipsoid of inertia and angular momentum

Figure 120 Behavior of moments of inertia as the body becomes smaller


PROBLEM. A piece is cut off a rigid body fixed at the stationary point O. How are the principal
moments of inertia changed? (Figure 120).


ANSWER. All three principal moments are decreased. 
Hint. Cf. Section 24. 


PROBLEM. A small mass $\varepsilon$ is added to a rigid body with moments of inertia $I_1 > I_2 > I_3$ at the
point $\mathbf{Q} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3$. Find the change in $I_1$ and $\mathbf{e}_1$ with error $O(\varepsilon^2)$.

Solution. The center of mass is displaced by a distance of order $\varepsilon$. Therefore, the moments of
inertia of the old body with respect to the parallel axes passing through the old and new centers
of mass differ in magnitude of order $\varepsilon^2$. At the same time, the addition of mass changes the
moment of inertia relative to any fixed axis by order $\varepsilon$. Therefore, we can disregard the displace-
ment of the center of mass for calculations with error $O(\varepsilon^2)$.

Thus, after addition of a small mass the kinetic energy takes the form


$$T = T_0 + \tfrac{1}{2}\varepsilon[\boldsymbol{\Omega}, \mathbf{Q}]^2 + O(\varepsilon^2),$$


where $T_0 = \tfrac{1}{2}(I_1\Omega_1^2 + I_2\Omega_2^2 + I_3\Omega_3^2)$ is the kinetic energy of the original body. We look for the
eigenvalue $I_1(\varepsilon)$ and eigenvector $\mathbf{e}_1(\varepsilon)$ of the inertia operator in the form of a Taylor series in $\varepsilon$.
By equating coefficients of $\varepsilon$ in the relation $A(\varepsilon)\mathbf{e}_1(\varepsilon) = I_1(\varepsilon)\mathbf{e}_1(\varepsilon)$, we find that, within error
$O(\varepsilon^2)$:


$$I_1(\varepsilon) \approx I_1 + \varepsilon(x_2^2 + x_3^2) \quad \text{and} \quad \mathbf{e}_1(\varepsilon) \approx \mathbf{e}_1 + \varepsilon\left(\frac{x_1x_2}{I_2 - I_1}\mathbf{e}_2 + \frac{x_1x_3}{I_3 - I_1}\mathbf{e}_3\right).$$

From the formula for $I_1(\varepsilon)$ it is clear that the change in the principal moments of inertia (to the
first approximation in $\varepsilon$) is as if neither the center of mass nor the principal axes changed. The
formula for $\mathbf{e}_1(\varepsilon)$ demonstrates how the directions of the principal axes change: the largest
principal axis of the inertia ellipsoid approaches the added point, and the smallest recedes from
it. Furthermore, the addition of a small mass on one of the principal planes of the inertia
ellipsoid rotates the two axes lying in this plane and does not change the direction of the third
axis. The appearance of the differences of moments of inertia in the denominator is connected
with the fact that the major axes of an ellipsoid of revolution are not defined. If the inertia
ellipsoid is nearly an ellipsoid of revolution (i.e., $I_1 \approx I_2$) then the addition of a small mass could
strongly turn the axes $\mathbf{e}_1$ and $\mathbf{e}_2$ in the plane spanned by them.


29 Euler's equations. Poinsot's description of the motion


Here we study the motion of a rigid body around a stationary point in the absence of outside 
forces and the similar motion of a free rigid body. The motion turns out to have two frequencies. 


A Euler’s equations 


Consider the motion of a rigid body around a stationary point O. Let $\mathbf{M}$ be
the angular momentum vector of the body relative to O in the body, $\boldsymbol{\Omega}$ the
angular velocity vector in the body, and A the inertia operator ($A\boldsymbol{\Omega} = \mathbf{M}$);
the vectors $\boldsymbol{\Omega}$ and $\mathbf{M}$ belong to the moving coordinate system K (Section 26).
The angular momentum vector of the body relative to O in space, $\mathbf{m} = B\mathbf{M}$,
is preserved under the motion (Section 28B).

Therefore, the vector $\mathbf{M}$ in the body ($\mathbf{M} \in K$) must move so that $\mathbf{m} = B_t\mathbf{M}(t)$
does not change when t changes.


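Theorem

$$(1) \qquad \frac{d\mathbf{M}}{dt} = [\mathbf{M}, \boldsymbol{\Omega}].$$


PROOF. We apply formula (5), Section 26 for the velocity of the motion of
the "point" $\mathbf{M}(t) \in K$ with respect to the stationary space k. We get

$$\dot{\mathbf{m}} = B\dot{\mathbf{M}} + [\boldsymbol{\omega}, \mathbf{m}] = B(\dot{\mathbf{M}} + [\boldsymbol{\Omega}, \mathbf{M}]).$$

But since the angular momentum $\mathbf{m}$ with respect to the space is preserved
($\dot{\mathbf{m}} = 0$), $\dot{\mathbf{M}} + [\boldsymbol{\Omega}, \mathbf{M}] = 0$. □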


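Relation (1) is called the Euler equations. Since $\mathbf{M} = A\boldsymbol{\Omega}$, (1) can be
viewed as a differential equation for $\mathbf{M}$ (or for $\boldsymbol{\Omega}$). If

$$\boldsymbol{\Omega} = \Omega_1\mathbf{e}_1 + \Omega_2\mathbf{e}_2 + \Omega_3\mathbf{e}_3 \quad \text{and} \quad \mathbf{M} = M_1\mathbf{e}_1 + M_2\mathbf{e}_2 + M_3\mathbf{e}_3$$

are the decompositions of $\boldsymbol{\Omega}$ and $\mathbf{M}$ with respect to the principal axes at O,
then $M_i = I_i\Omega_i$ and (1) becomes the system of three equations

$$(2) \qquad \frac{dM_1}{dt} = a_1M_2M_3, \quad \frac{dM_2}{dt} = a_2M_3M_1, \quad \frac{dM_3}{dt} = a_3M_1M_2,$$

where $a_1 = (I_2 - I_3)/I_2I_3$, $a_2 = (I_3 - I_1)/I_3I_1$, and $a_3 = (I_1 - I_2)/I_1I_2$, or,
in the form of a system of three equations for the three components of the
angular velocity,

$$I_1\frac{d\Omega_1}{dt} = (I_2 - I_3)\Omega_2\Omega_3,$$

$$I_2\frac{d\Omega_2}{dt} = (I_3 - I_1)\Omega_3\Omega_1,$$

$$I_3\frac{d\Omega_3}{dt} = (I_1 - I_2)\Omega_1\Omega_2.$$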

Remark. Suppose that outside forces act on the body, the sum of whose
moments with respect to O is equal to $\mathbf{n}$ in the stationary coordinate system
and $\mathbf{N}$ in the moving system ($\mathbf{n} = B\mathbf{N}$). Then

$$\dot{\mathbf{m}} = \mathbf{n}$$

and the Euler equations take the form

$$\frac{d\mathbf{M}}{dt} = [\mathbf{M}, \boldsymbol{\Omega}] + \mathbf{N}.$$

B Solutions of the Euler equations

Lemma. The Euler equations (2) have two quadratic first integrals

$$2E = \frac{M_1^2}{I_1} + \frac{M_2^2}{I_2} + \frac{M_3^2}{I_3} \quad \text{and} \quad M^2 = M_1^2 + M_2^2 + M_3^2.$$


PROOF. E is preserved by the law of conservation of energy, and $M^2$ by the
law of conservation of angular momentum $\mathbf{m}$, since $\mathbf{m}^2 = \mathbf{M}^2 = M^2$. □
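The two first integrals of the lemma can be observed numerically. The following sketch integrates the Euler equations $d\mathbf{M}/dt = [\mathbf{M}, \boldsymbol{\Omega}]$, $\Omega_i = M_i/I_i$, with a classical fourth-order Runge-Kutta step. The integrator, the principal moments $(3, 2, 1)$, the initial vector, and the step size are all choices made for this illustration, not something prescribed by the text.

```python
def euler_rhs(M, I):
    """Right-hand side of dM/dt = [M, Omega], with Omega_i = M_i / I_i in the principal axes."""
    O = [M[i] / I[i] for i in range(3)]
    return [M[1] * O[2] - M[2] * O[1],
            M[2] * O[0] - M[0] * O[2],
            M[0] * O[1] - M[1] * O[0]]


def rk4_step(M, I, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = euler_rhs(M, I)
    k2 = euler_rhs([M[i] + 0.5 * dt * k1[i] for i in range(3)], I)
    k3 = euler_rhs([M[i] + 0.5 * dt * k2[i] for i in range(3)], I)
    k4 = euler_rhs([M[i] + dt * k3[i] for i in range(3)], I)
    return [M[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(3)]


def two_E(M, I):
    """2E = M_1^2/I_1 + M_2^2/I_2 + M_3^2/I_3."""
    return sum(M[i] ** 2 / I[i] for i in range(3))


def M_squared(M):
    """M^2 = M_1^2 + M_2^2 + M_3^2."""
    return sum(c * c for c in M)


I = (3.0, 2.0, 1.0)        # hypothetical principal moments, I_1 > I_2 > I_3
M = [1.0, 0.5, -0.7]       # hypothetical initial angular momentum in the body
start = list(M)
E_start, M2_start = two_E(M, I), M_squared(M)

for _ in range(10000):     # integrate up to t = 10 with dt = 0.001
    M = rk4_step(M, I, 0.001)

drift_E = abs(two_E(M, I) - E_start)
drift_M2 = abs(M_squared(M) - M2_start)
```

The vector `M` itself changes along the trajectory, while both drifts stay at the level of the integration error.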


Thus, $\mathbf{M}$ lies in the intersection of an ellipsoid and a sphere. In order to
study the structure of the curves of intersection we will fix the ellipsoid
E > 0 and change the radius M of the sphere (Figure 121).


Figure 121 Trajectories of Euler's equation on an energy level surface


We assume that $I_1 > I_2 > I_3$. The semiaxes of the ellipsoid will be
$\sqrt{2EI_1} > \sqrt{2EI_2} > \sqrt{2EI_3}$. If the radius M of the sphere is less than the
smallest semiaxes or larger than the largest ($M < \sqrt{2EI_3}$ or $M > \sqrt{2EI_1}$),
then the intersection is empty, and no actual motion corresponds to such
values of E and M. If the radius of the sphere is equal to the smallest semi-
axes, then the intersection consists of two points. Increasing the radius, so
that $\sqrt{2EI_3} < M < \sqrt{2EI_2}$, we get two curves around the ends of the small-
est semiaxes. In exactly the same way, if the radius of the sphere is equal
to the largest semiaxes we get their ends, and if it is a little smaller we get
two closed curves close to the ends of the largest semiaxes. Finally, if
$M = \sqrt{2EI_2}$, the intersection consists of two circles.

Each of the six ends of the semiaxes of the ellipsoid is a separate trajectory
of the Euler equations (2): a stationary position of the vector $\mathbf{M}$. It corre-
sponds to a fixed value of the vector of angular velocity directed along one
of the principal axes $\mathbf{e}_i$; during such a motion, $\boldsymbol{\Omega}$ remains collinear with $\mathbf{M}$.
Therefore, the vector of angular velocity retains its position $\boldsymbol{\omega}$ in space
collinear with $\mathbf{m}$: the body simply rotates with fixed angular velocity around
the principal axis of inertia $\mathbf{e}_i$, which is stationary in space.

Definition. A motion of a body, under which its angular velocity remains
constant ($\boldsymbol{\omega}$ = const, $\boldsymbol{\Omega}$ = const) is called a stationary rotation.


We have proved: 


Theorem. A rigid body fixed at a point O admits a stationary rotation around
any of the three principal axes $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$.


If, as we assumed, $I_1 > I_2 > I_3$, then the right-hand side of the Euler
equations does not become 0 anywhere else, i.e., there are no other stationary
rotations.

We will now investigate the stability (in the sense of Liapunov) of solu- 
tions to the Euler equations. 


Theorem. The stationary solutions $\mathbf{M} = M_1\mathbf{e}_1$ and $\mathbf{M} = M_3\mathbf{e}_3$ of the Euler
equations corresponding to the largest and smallest principal axes are
stable, while the solution corresponding to the middle axis ($\mathbf{M} = M_2\mathbf{e}_2$)
is unstable.


PROOF. For a small deviation of the initial condition from $M_1\mathbf{e}_1$ or $M_3\mathbf{e}_3$,
the trajectory will be a small closed curve, while for a small deviation from
$M_2\mathbf{e}_2$ it will be a large one. □
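The contrast between small and large closed curves shows up clearly in a numerical experiment. The sketch below is an illustration under assumed data: principal moments $(3, 2, 1)$, perturbations of size 0.01, and a Runge-Kutta integrator chosen for convenience. It follows $\mathbf{M}(t)$ from a start near $\mathbf{e}_1$ and from a start near $\mathbf{e}_2$, recording how far the dominant component wanders.

```python
def euler_rhs(M, I):
    """dM/dt = [M, Omega] with Omega_i = M_i / I_i."""
    O = [M[i] / I[i] for i in range(3)]
    return [M[1] * O[2] - M[2] * O[1],
            M[2] * O[0] - M[0] * O[2],
            M[0] * O[1] - M[1] * O[0]]


def trajectory(M, I, dt, steps):
    """Integrate the Euler equations with classical Runge-Kutta steps; return all visited points."""
    path = [list(M)]
    for _ in range(steps):
        k1 = euler_rhs(M, I)
        k2 = euler_rhs([M[i] + 0.5 * dt * k1[i] for i in range(3)], I)
        k3 = euler_rhs([M[i] + 0.5 * dt * k2[i] for i in range(3)], I)
        k4 = euler_rhs([M[i] + dt * k3[i] for i in range(3)], I)
        M = [M[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6 for i in range(3)]
        path.append(M)
    return path


I = (3.0, 2.0, 1.0)  # hypothetical principal moments, I_1 > I_2 > I_3

# Start close to the stationary rotation about e_1 (largest moment of inertia).
near_e1 = trajectory([1.0, 0.01, 0.01], I, 0.01, 8000)
min_M1 = min(p[0] for p in near_e1)

# Start close to the stationary rotation about the middle axis e_2.
near_e2 = trajectory([0.01, 1.0, 0.01], I, 0.01, 8000)
min_M2 = min(p[1] for p in near_e2)
```

In this run `min_M1` stays close to 1, while `min_M2` comes close to -1: the perturbed vector leaves the neighborhood of $\mathbf{e}_2$ and travels to the vicinity of $-\mathbf{e}_2$.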


PROBLEM. Are stationary rotations of the body around the largest and smallest principal axes
Liapunov stable?


ANSWER. No. 


C Poinsot’s description of the motion 


It is easy to visualize the motion of the angular momentum and angular
velocity vectors in a body ($\mathbf{M}$ and $\boldsymbol{\Omega}$); they are periodic if $M \neq \sqrt{2EI_i}$.
In order to see how a body rotates in space, we look at its inertia ellipsoid

$$E = \{\boldsymbol{\Omega} : (A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = 1\} \subset K,$$


where $A: \boldsymbol{\Omega} \to \mathbf{M}$ is the symmetric operator of inertia of the body fixed
at O.

At every moment of time the ellipsoid E occupies a position $B_tE$ in the
stationary space k.


Theorem (Poinsot). The inertia ellipsoid rolls without slipping along a station- 
ary plane perpendicular to the angular momentum vector m (Figure 122). 


PROOF. Consider a plane $\pi$ perpendicular to the momentum vector $\mathbf{m}$ and
tangent to the inertia ellipsoid $B_tE$. There are two such planes, and at the
point of tangency the normal to the ellipsoid is parallel to $\mathbf{m}$.

Figure 122 Rolling of the ellipsoid of inertia on the invariable plane


But the inertia ellipsoid E has normal $\mathrm{grad}(A\boldsymbol{\Omega}, \boldsymbol{\Omega}) = 2A\boldsymbol{\Omega} = 2\mathbf{M}$ at the
point $\boldsymbol{\Omega}$. Therefore, at the points $\pm\boldsymbol{\xi} = \pm\boldsymbol{\omega}/\sqrt{2T}$ of the $\boldsymbol{\omega}$ axis, the normal to
$B_tE$ is collinear with $\mathbf{m}$.

So the plane $\pi$ is tangent to $B_tE$ at the points $\pm\boldsymbol{\xi}$ on the instantaneous
axis of rotation. But the scalar product of $\boldsymbol{\xi}$ with the stationary vector $\mathbf{m}$ is
equal to $\pm(1/\sqrt{2T})(\mathbf{m}, \boldsymbol{\omega}) = \pm\sqrt{2T}$, and is therefore constant. So the
distance of the plane $\pi$ from O does not change, i.e., $\pi$ is stationary.

Since the point of tangency lies on the instantaneous axis of rotation, its
velocity is equal to zero. This implies that the ellipsoid $B_tE$ rolls without
slipping along $\pi$. □


Translator's remark: The plane $\pi$ is sometimes called the invariable plane.


Corollary. Under initial conditions close to a stationary rotation around the
large (or small) axis of inertia, the angular velocity always remains close
to its initial position, not only in the body ($\boldsymbol{\Omega}$) but also in space ($\boldsymbol{\omega}$).


We now consider the trajectory of the point of tangency in the stationary
plane $\pi$. When the point of tangency makes an entire revolution on the ellip-
soid, the initial conditions are repeated except that the body has turned
through some angle $\alpha$ around the $\mathbf{m}$ axis. The second revolution will be
exactly like the first; if $\alpha = 2\pi(p/q)$, the motion is completely periodic; if
the angle is not commensurable with $2\pi$, the body will never return to its
initial state.

In this case the trajectory of the point of tangency is dense in an annulus
with center O' in the plane (Figure 123).


PROBLEM. Show that the connected components of the invariant two-
dimensional manifold $V_c$ (Section 28B) in the six-dimensional space TSO(3)
are tori, and that one can choose coordinates $\varphi_1$ and $\varphi_2$ mod $2\pi$ on them so
that $\dot\varphi_1 = \omega_1(c)$ and $\dot\varphi_2 = \omega_2(c)$.
Hint. Take the phase of the periodic variation of $\mathbf{M}$ as $\varphi_1$.

Figure 123 Trajectory of the point of contact on the invariable plane


We now look at the important special case when the inertia ellipsoid is
an ellipsoid of revolution:


$$I_2 = I_3 \neq I_1.$$


In this case the axis of the ellipsoid $B_t\mathbf{e}_1$, the instantaneous axis of rotation
$\boldsymbol{\omega}$, and the vector $\mathbf{m}$ always lie in one plane. The angles between them and the
length of the vector $\boldsymbol{\omega}$ are preserved; the axes of rotation ($\boldsymbol{\omega}$) and symmetry
($B_t\mathbf{e}_1$) sweep out cones around the angular momentum vector $\mathbf{m}$ with the
same angular velocity (Figure 124). This motion around $\mathbf{m}$ is called pre-
cession.


PROBLEM. Find the angular velocity of precession.
ANSWER. Decompose the angular velocity vector $\boldsymbol{\omega}$ into components in the directions of the
angular momentum vector $\mathbf{m}$ and the axis of the body $B_t\mathbf{e}_1$. The first component gives the angular
velocity of precession, $\omega_{pr} = M/I_2$.


Figure 124 Rolling of an ellipsoid of revolution on the invariable plane

Hint. Represent the motion of the body as the product of a rotation around the axis of
momentum and a subsequent rotation around the axis of the body. The sum of the angular 
velocity vectors of these rotations is equal to the angular velocity vector of the product. 
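The decomposition asked for in the answer can be written out explicitly. The following is a sketch of the computation in the notation of this section, assuming $I_2 = I_3 \neq I_1$ and writing $M = |\mathbf{m}|$; the intermediate steps are a reconstruction, not quoted from the text.

```latex
\begin{aligned}
\mathbf{M} &= I_1\Omega_1\mathbf{e}_1 + I_2(\Omega_2\mathbf{e}_2 + \Omega_3\mathbf{e}_3), \\
\boldsymbol{\Omega} &= \Omega_1\mathbf{e}_1 + \Omega_2\mathbf{e}_2 + \Omega_3\mathbf{e}_3
  = \frac{\mathbf{M}}{I_2} + \Bigl(1 - \frac{I_1}{I_2}\Bigr)\Omega_1\mathbf{e}_1, \\
\boldsymbol{\omega} &= B_t\boldsymbol{\Omega}
  = \frac{M}{I_2}\,\frac{\mathbf{m}}{|\mathbf{m}|}
  + \frac{I_2 - I_1}{I_2}\,\Omega_1\, B_t\mathbf{e}_1 .
\end{aligned}
```

The coefficient of the unit vector $\mathbf{m}/|\mathbf{m}|$ is the angular velocity of precession $M/I_2$; the second term is a rotation around the axis of the body $B_t\mathbf{e}_1$.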

Remark. In the absence of outside forces, a rigid body fixed at a point O is represented by a 
lagrangian system whose configuration space is a group, namely SO(3), and the lagrangian 
function is invariant under left translations. One can show that a significant part of Euler’s theory 
of rigid body motion uses only this property and therefore holds for an arbitrary left-invariant 
lagrangian system on an arbitrary Lie group. In particular, by applying this theory to the group 
of volume-preserving diffeomorphisms of a domain D in a riemannian manifold, one can obtain 
the basic theorems of the hydrodynamics of an ideal fluid. (See Appendix 2.) 


30 Lagrange’s top 


We consider here the motion of an axially symmetric rigid body fixed at a stationary point in a
uniform force field. This motion is composed of three periodic processes: rotation, precession, 
and nutation. 


A Euler angles 


Consider a rigid body fixed at a stationary point O and subject to the action 
of the gravitational force mg. The problem of the motion of such a “heavy 
rigid body” has not yet been solved in the general case and in some sense is 
unsolvable. 

In this problem with three degrees of freedom, only two first integrals
are known: the total energy E = T + U, and the projection $M_z$ of the
angular momentum on the vertical. There is an important special case in
which the problem can be completely solved: the case of a symmetric top. A
symmetric or lagrangian top is a rigid body fixed at a stationary point O
whose inertia ellipsoid at O is an ellipsoid of revolution and whose center of
gravity lies on the axis of symmetry $\mathbf{e}_3$ (Figure 125). In this case, a rotation


Figure 125 Lagrangian top 


around the $\mathbf{e}_3$ axis does not change the lagrangian function, and by Noether's
theorem there must exist a first integral in addition to E and $M_z$ (as we will
see, it turns out to be the projection $M_3$ of the angular momentum vector on
the $\mathbf{e}_3$ axis).

If we can introduce three coordinates so that the angles of rotation around
the z axis and around the axis of the top are among them, then these
coordinates will be cyclic, and the problem with three degrees of freedom will
reduce to a problem with one degree of freedom (for the third coordinate).

Such a choice of coordinates on the configuration space SO(3) is possible;
these coordinates $\varphi$, $\psi$, $\theta$ are called the Euler angles and form a local co-
ordinate system in SO(3) similar to geographical coordinates on the sphere:
they exclude the poles and are multiple-valued on one meridian.




Figure 126 Euler angles 


We introduce the following notation (Figure 126): 


$\mathbf{e}_x$, $\mathbf{e}_y$, and $\mathbf{e}_z$ are the unit vectors of a right-handed cartesian stationary
coordinate system at the stationary point O;

$\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are the unit vectors of a right moving coordinate system
connected to the body, directed along the principal axes at O;

$I_1 = I_2 \neq I_3$ are the moments of inertia of the body at O;

$\mathbf{e}_N$ is the unit vector of the axis $[\mathbf{e}_z, \mathbf{e}_3]$, called the "line of nodes"
(all vectors are in the "stationary space" k).


In order to carry the stationary frame ($\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$) into the moving frame
($\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$), we must perform three rotations:


1. Through an angle $\varphi$ around the $\mathbf{e}_z$ axis. Under this rotation, $\mathbf{e}_z$ remains
fixed, and $\mathbf{e}_x$ goes to $\mathbf{e}_N$.

2. Through an angle $\theta$ around the $\mathbf{e}_N$ axis. Under this rotation, $\mathbf{e}_z$ goes to
$\mathbf{e}_3$, and $\mathbf{e}_N$ remains fixed.

3. Through an angle $\psi$ around the $\mathbf{e}_3$ axis. Under this rotation, $\mathbf{e}_N$ goes to
$\mathbf{e}_1$, and $\mathbf{e}_3$ stays fixed.


After all three rotations, $\mathbf{e}_x$ has gone to $\mathbf{e}_1$, and $\mathbf{e}_z$ to $\mathbf{e}_3$; therefore, $\mathbf{e}_y$
goes to $\mathbf{e}_2$.

The angles $\varphi$, $\psi$, and $\theta$ are called the Euler angles. It is easy to prove:


Theorem. To every triple of numbers $\varphi$, $\theta$, $\psi$ the construction above associates
a rotation of three-dimensional space, $B(\varphi, \theta, \psi) \in SO(3)$, taking the
frame ($\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$) into the frame ($\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$). In addition, the mapping
$(\varphi, \theta, \psi) \to B(\varphi, \theta, \psi)$ gives local coordinates


$$0 < \varphi < 2\pi, \quad 0 < \psi < 2\pi, \quad 0 < \theta < \pi$$


on SO(3), the configuration space of the top. Like geographical longitude,
$\varphi$ and $\psi$ can be considered as angles mod $2\pi$; for $\theta = 0$ or $\theta = \pi$ the map
$(\varphi, \theta, \psi) \to B$ has a pole-type singularity.
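The construction of $B(\varphi, \theta, \psi)$ can be sketched in coordinates. The sketch rests on one assumption that should be stated plainly: the three successive rotations about $\mathbf{e}_z$, $\mathbf{e}_N$, and $\mathbf{e}_3$ (axes that move with the frame) compose to the matrix product $R_z(\varphi)R_x(\theta)R_z(\psi)$ of rotations about the fixed coordinate axes. The function names and sample angles are illustrative.

```python
import math


def rot_z(a):
    """Rotation through the angle a around the third coordinate axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(a):
    """Rotation through the angle a around the first coordinate axis."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_rotation(phi, theta, psi):
    """Matrix of B(phi, theta, psi); its columns are e_1, e_2, e_3 in the basis e_x, e_y, e_z."""
    return matmul(rot_z(phi), matmul(rot_x(theta), rot_z(psi)))


def column(A, j):
    return tuple(A[i][j] for i in range(3))


phi, theta, psi = 0.7, 1.1, 2.3            # sample angles with 0 < theta < pi
B = euler_rotation(phi, theta, psi)
e3 = column(B, 2)                          # axis of the top
e_N = column(euler_rotation(phi, theta, 0.0), 0)   # e_1 before the last rotation

# Line of nodes computed independently as the normalized vector product [e_z, e_3].
n = (-e3[1], e3[0], 0.0)
norm = math.hypot(n[0], n[1])
e_N_from_cross = (n[0] / norm, n[1] / norm, 0.0)
```

Here `e3` has third component $\cos\theta$, and `e_N` equals $(\cos\varphi, \sin\varphi, 0)$, which coincides with the direction of $[\mathbf{e}_z, \mathbf{e}_3]$.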


B Calculation of the lagrangian function 


We will express the lagrangian function in terms of the coordinates $\varphi$, $\theta$,
$\psi$ and their derivatives.
The potential energy, clearly, is equal to


$$U = \iiint zg \, dm = mgz_0 = mgl\cos\theta,$$


where $z_0$ is the height of the center of gravity above O (Figure 125).
We now calculate the kinetic energy. A small trick is useful here: we
consider the particular case when $\varphi = \psi = 0$.


Lemma. The angular velocity of a top is expressed in terms of the derivatives
of the Euler angles by the formula

$$\boldsymbol{\omega} = \dot\theta\mathbf{e}_1 + (\dot\varphi\sin\theta)\mathbf{e}_2 + (\dot\psi + \dot\varphi\cos\theta)\mathbf{e}_3,$$

if $\varphi = \psi = 0$.


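PROOF. We look at the velocity of a point of the top occupying the position
$\mathbf{r}$ at time t. After time dt this point takes the position (within $(dt)^2$)


$$B(\varphi + d\varphi, \theta + d\theta, \psi + d\psi)B^{-1}(\varphi, \theta, \psi)\mathbf{r},$$


where $d\varphi = \dot\varphi\,dt$, $d\theta = \dot\theta\,dt$ and $d\psi = \dot\psi\,dt$.
Consequently, to the same accuracy the displacement vector is the sum
of the three terms


$$B(\varphi + d\varphi, \theta, \psi)B^{-1}(\varphi, \theta, \psi)\mathbf{r} - \mathbf{r} = [\boldsymbol{\omega}_\varphi, \mathbf{r}]\,dt,$$
$$B(\varphi, \theta + d\theta, \psi)B^{-1}(\varphi, \theta, \psi)\mathbf{r} - \mathbf{r} = [\boldsymbol{\omega}_\theta, \mathbf{r}]\,dt,$$
$$B(\varphi, \theta, \psi + d\psi)B^{-1}(\varphi, \theta, \psi)\mathbf{r} - \mathbf{r} = [\boldsymbol{\omega}_\psi, \mathbf{r}]\,dt$$


(the angular velocities $\boldsymbol{\omega}_\varphi$, $\boldsymbol{\omega}_\theta$, and $\boldsymbol{\omega}_\psi$ are defined by these formulas).
Therefore, the velocity of the point $\mathbf{r}$ is $\mathbf{v} = [\boldsymbol{\omega}_\varphi + \boldsymbol{\omega}_\theta + \boldsymbol{\omega}_\psi, \mathbf{r}]$, so the
angular velocity of the body is


$$\boldsymbol{\omega} = \boldsymbol{\omega}_\varphi + \boldsymbol{\omega}_\theta + \boldsymbol{\omega}_\psi,$$


where the terms are defined by the formulas above.

It remains to decompose the vectors $\boldsymbol{\omega}_\varphi$, $\boldsymbol{\omega}_\theta$, and $\boldsymbol{\omega}_\psi$ with respect to
$\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$. We have not yet used the fact that $\varphi = \psi = 0$. If $\varphi = \psi = 0$,
then

$$B(\varphi + d\varphi, \theta, \psi)B^{-1}(\varphi, \theta, \psi)$$

is simply a rotation around the axis $\mathbf{e}_z$ through an angle $d\varphi$, so

$$\boldsymbol{\omega}_\varphi = \dot\varphi\mathbf{e}_z.$$

Furthermore, $B(\varphi, \theta + d\theta, \psi)B^{-1}(\varphi, \theta, \psi)$ is simply a rotation around the
axis $\mathbf{e}_N = \mathbf{e}_x = \mathbf{e}_1$ through an angle $d\theta$ in the case $\varphi = \psi = 0$, so

$$\boldsymbol{\omega}_\theta = \dot\theta\mathbf{e}_1.$$

Finally, $B(\varphi, \theta, \psi + d\psi)B^{-1}(\varphi, \theta, \psi)$ is a rotation through an angle $d\psi$
around the axis $\mathbf{e}_3$, so

$$\boldsymbol{\omega}_\psi = \dot\psi\mathbf{e}_3.$$

In short, for $\varphi = \psi = 0$ we have

$$\boldsymbol{\omega} = \dot\varphi\mathbf{e}_z + \dot\theta\mathbf{e}_1 + \dot\psi\mathbf{e}_3.$$

But, clearly, for $\varphi = \psi = 0$

$$\mathbf{e}_z = \mathbf{e}_3\cos\theta + \mathbf{e}_2\sin\theta.$$

So the components of the angular velocity along the principal axes $\mathbf{e}_1$, $\mathbf{e}_2$,
and $\mathbf{e}_3$ are

$$\omega_1 = \dot\theta, \quad \omega_2 = \dot\varphi\sin\theta, \quad \omega_3 = \dot\psi + \dot\varphi\cos\theta.$$ □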


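Since $T = \tfrac{1}{2}(I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2)$ and $I_1 = I_2$, the kinetic energy for $\varphi = \psi = 0$ is
given by the formula


$$T = \frac{I_1}{2}(\dot\theta^2 + \dot\varphi^2\sin^2\theta) + \frac{I_3}{2}(\dot\psi + \dot\varphi\cos\theta)^2.$$


But the kinetic energy cannot depend on $\varphi$ and $\psi$: these are cyclic co-
ordinates, and by a choice of origin of reference for $\varphi$ and $\psi$ which does not
change T we can always make $\varphi = 0$ and $\psi = 0$. Thus the formula we got
for the kinetic energy is true for all $\varphi$ and $\psi$.

In this way we obtain the lagrangian function


$$L = \frac{I_1}{2}(\dot\theta^2 + \dot\varphi^2\sin^2\theta) + \frac{I_3}{2}(\dot\psi + \dot\varphi\cos\theta)^2 - mgl\cos\theta.$$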


C Investigation of the motion 

To the cyclic coordinates $\varphi$ and $\psi$ there correspond the first integrals

$$\frac{\partial L}{\partial\dot\varphi} = M_z = \dot\varphi(I_1\sin^2\theta + I_3\cos^2\theta) + \dot\psi I_3\cos\theta,$$

$$\frac{\partial L}{\partial\dot\psi} = M_3 = \dot\varphi I_3\cos\theta + \dot\psi I_3.$$

Theorem. The inclination $\theta$ of the axis of the top to the vertical changes with
time in the same way as in the one-dimensional system with energy


$$E' = \frac{I_1}{2}\dot\theta^2 + U_{\mathrm{eff}}(\theta),$$


where the effective potential energy is given by the formula


$$U_{\mathrm{eff}} = \frac{(M_z - M_3\cos\theta)^2}{2I_1\sin^2\theta} + mgl\cos\theta.$$


PROOF. Following the general theory, we express $\dot\varphi$ and $\dot\psi$ in terms of $M_3$
and $M_z$. We get the total energy of the system as


$$E = \frac{I_1}{2}\dot\theta^2 + \frac{M_3^2}{2I_3} + mgl\cos\theta + \frac{(M_z - M_3\cos\theta)^2}{2I_1\sin^2\theta}$$

and

$$\dot\varphi = \frac{M_z - M_3\cos\theta}{I_1\sin^2\theta}.$$


The number $M_3^2/2I_3 = E - E'$, independent of $\theta$, does not affect the
equation for $\theta$. □


In order to study the one-dimensional system above it is convenient to
make the substitution $\cos\theta = u$ ($-1 \le u \le 1$).
We also write

\[ \frac{M_z}{I_1} = a, \qquad \frac{M_3}{I_1} = b, \qquad \frac{2E'}{I_1} = \alpha, \qquad \frac{2mgl}{I_1} = \beta > 0. \]

Then we can rewrite the law of conservation of energy $E'$ as

\[ \dot u^2 = f(u), \]

where $f(u) = (\alpha - \beta u)(1 - u^2) - (a - bu)^2$, and the law of variation of
the azimuth $\varphi$ as

\[ \dot\varphi = \frac{a - bu}{1 - u^2}. \]

We notice that $f(u)$ is a polynomial of degree 3, $f(+\infty) = +\infty$, and
$f(\pm 1) = -(a \mp b)^2 < 0$ if $a \ne \pm b$. On the other hand, actual motions
correspond to constants $a$, $b$, $\alpha$, and $\beta$ for which $f(u) \ge 0$ for some
$-1 \le u \le 1$. Thus $f(u)$ has exactly two real roots $u_1$ and $u_2$ on the interval
$-1 \le u \le 1$ (and one for $u > 1$, Figure 127). Therefore, the inclination $\theta$
of the axis of the top changes periodically between two limit values $\theta_1$ and $\theta_2$
(Figure 128). This periodic change in inclination is called nutation.
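
The two limit values can also be located numerically. The following Python sketch is an illustration only and is not part of the text: the names `f` and `nutation_limits`, the sampling scheme, and the sample constants are our assumptions. It scans the interval for sign changes of $f(u)$ and refines each one by bisection, so it would miss a double root where $f$ touches zero without changing sign.

```python
import math

def f(u, a, b, alpha, beta):
    """Right-hand side of (du/dt)^2 = f(u) for the Lagrange top."""
    return (alpha - beta * u) * (1.0 - u * u) - (a - b * u) ** 2

def nutation_limits(a, b, alpha, beta, samples=2000):
    """Roots of f on [-1, 1), found by a sign-change scan refined by bisection."""
    roots = []
    grid = [-1.0 + 2.0 * i / samples for i in range(samples + 1)]
    for lo, hi in zip(grid, grid[1:]):
        f_lo, f_hi = f(lo, a, b, alpha, beta), f(hi, a, b, alpha, beta)
        if f_lo == 0.0:
            roots.append(lo)  # a grid point happens to be a root
        elif f_lo * f_hi < 0.0:
            for _ in range(60):  # bisection keeps the sign change bracketed
                mid = 0.5 * (lo + hi)
                f_mid = f(mid, a, b, alpha, beta)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return roots

# Hypothetical sample: a top released from rest at u0 = cos(theta0) = 0.5,
# for which a = b*u0 and alpha = beta*u0.
u1, u2 = nutation_limits(a=1.5, b=3.0, alpha=0.5, beta=1.0)
theta_limits = (math.acos(u2), math.acos(u1))  # theta grows as u decreases
```

For these sample constants $f(u) = (u_0 - u)\,[\beta(1 - u^2) - b^2(u_0 - u)]$, so one root is $u_0$ itself and the other is the smaller root of $\beta u^2 - b^2 u + b^2 u_0 - \beta = 0$.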
Figure 127. Graph of the function f(u)


We now consider the motion of the azimuth of the axis of the top. The
point of intersection of the axis with the unit sphere moves in the ring between
the parallels $\theta_1$ and $\theta_2$. The variation of the azimuth of the axis is determined
by the equation

\[ \dot\varphi = \frac{a - bu}{1 - u^2}. \]

If the root $u'$ of the equation $a = bu$ lies outside of $(u_1, u_2)$, then the angle $\varphi$
varies monotonically and the axis traces a curve like a sinusoid on the unit
sphere (Figure 128(a)). If the root $u'$ of the equation $a = bu$ lies inside
$(u_1, u_2)$, then the rate of change of $\varphi$ is in opposite directions on the parallels
$\theta_1$ and $\theta_2$, and the axis traces a looping curve on the sphere (Figure 128(b)).

If the root $u'$ of $a = bu$ lies on the boundary (e.g., $u' = u_2$), then the axis
traces a curve with cusps (Figure 128(c)).

The last case, although exceptional, is observed every time we release
the axis of a top launched at inclination $\theta_2$ without initial velocity; the top
first falls, but then rises again.
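
The three cases can be told apart by comparing $u' = a/b$ with the nutation limits. The short Python sketch below is only an illustration; the function name `precession_pattern`, the tolerance `tol`, and the returned labels are our assumptions, and the case $b = 0$ is treated as monotone on the assumption that $a \ne 0$.

```python
def precession_pattern(a, b, u1, u2, tol=1e-9):
    """Classify the trace of the axis by the position of u' = a/b relative to [u1, u2]."""
    if b == 0:
        return "monotone"  # a - b*u keeps the sign of a (a != 0 assumed)
    u_prime = a / b
    if abs(u_prime - u1) < tol or abs(u_prime - u2) < tol:
        return "cusps"     # the azimuth stops exactly on a limiting parallel
    if u1 < u_prime < u2:
        return "loops"     # the azimuth reverses direction inside the ring
    return "monotone"      # the azimuth never reverses
```

For a top released from rest at $u_0$ we have $a = b u_0$, so $u' = u_0$ is one of the limits and the function returns "cusps".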

The azimuthal motion of the top is called precession. The complete 
motion of the top consists of rotation around its own axis, nutation, and 
precession. Each of the three motions has its own frequency. If the frequencies 
are incommensurable, the top never returns to its initial position, although 
it approaches it arbitrarily closely. 


Figure 128 Path of the top’s axis on the unit sphere (panels (a), (b), (c))

31 Sleeping tops and fast tops


The formulas obtained in Section 30 reduce the solution of the equations of motion of a top to 
elliptic integrals. However, qualitative information about the motion is usually easy to obtain 
without turning to quadrature. 

In this paragraph we investigate the stability of a vertical top and give approximate formulas 
for the motion of a rapidly spinning top. 


A Sleeping tops 

We consider first the particular solution of the equations of motion in
which the axis of the top is always vertical ($\theta = 0$) and the angular velocity
is constant (a “sleeping” top). In this case, clearly, $M_z = M_3 = I_3\omega_3$
(Figure 129).

Figure 129 Sleeping top


PROBLEM. Show that a stationary rotation around the vertical axis is always Liapunov unstable. 


We will look at the motion of the axis of the top, and not of the top itself.
Will the axis of the top stably remain close to the vertical, i.e., will $\theta$ remain
small? Expressing the effective potential energy of the system

\[ U_{\mathrm{eff}} = \frac{(M_z - M_3\cos\theta)^2}{2I_1\sin^2\theta} + mgl\cos\theta \]

as a power series in $\theta$, we find

\[ U_{\mathrm{eff}} = C + A\theta^2 + \cdots, \qquad A = \frac{I_3^2\omega_3^2}{8I_1} - \frac{mgl}{2}. \]

If $A > 0$, the equilibrium position $\theta = 0$ of the one-dimensional system
is stable, and if $A < 0$ it is unstable. Thus, the condition for stability has the
form

\[ \omega_3^2 > \frac{4mglI_1}{I_3^2}. \]

When friction reduces the velocity of a sleeping top to below this limit, the
top wakes up.
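
As a quick numerical illustration of this threshold, the following Python sketch computes the critical spin and the sign of the coefficient $A$. The function names and the sample values of $m$, $g$, $l$, $I_1$, $I_3$ are hypothetical and serve only to show the formula at work.

```python
import math

def sleeping_top_critical_spin(m, g, l, I1, I3):
    """Smallest |omega_3| for which A > 0, i.e. omega_3^2 = 4*m*g*l*I1 / I3^2."""
    return 2.0 * math.sqrt(m * g * l * I1) / I3

def is_sleeping_top_stable(omega3, m, g, l, I1, I3):
    """Sign test on A = I3^2*omega3^2/(8*I1) - m*g*l/2 from the expansion of U_eff."""
    A = (I3 * omega3) ** 2 / (8.0 * I1) - m * g * l / 2.0
    return A > 0.0

# Hypothetical top: m = 1, g = 10, l = 0.1, I1 = 0.01, I3 = 0.02 (consistent units).
omega_crit = sleeping_top_critical_spin(1.0, 10.0, 0.1, 0.01, 0.02)
```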


PROBLEM. Show that, for $\omega_3^2 > 4mglI_1/I_3^2$, the axis of a sleeping top is stable with respect to
perturbations which change the values of $M_z$ and $M_3$, as well as $\theta$.


B Fast tops 


A top is called fast if the kinetic energy of its rotation is large in comparison
with its potential energy:

\[ \tfrac{1}{2} I_3\omega_3^2 \gg mgl. \]

It is clear from a similarity argument that multiplying the angular velocity
by $N$ is exactly equivalent to dividing the weight by $N^2$.

Theorem. If, while the initial position of a top is preserved, the angular velocity
is multiplied by $N$, then the trajectory of the top will be exactly the same as
if the angular velocity remained as it was and the acceleration of gravity
$g$ were divided by $N^2$. In the case of large angular velocity the trajectory
clearly goes $N$ times faster.⁵¹

In this way we can study the case $g \to 0$ and apply the results to study
the case $\omega \to \infty$.

To begin, we consider the case g = 0, i.e., the motion of a symmetric 
top in the absence of gravity. We compare two descriptions of this motion: 
Lagrange’s (Section 30C) and Poinsot’s (Section 29C). 

We first consider Lagrange’s equation for the variation of the angle of
inclination $\theta$ of the top’s axis.

Lemma. In the absence of gravity, the angle $\theta_0$ satisfying $M_z = M_3\cos\theta_0$
is a stable equilibrium position of the equation of motion of the top’s axis.
The frequency of small oscillations of $\theta$ near this equilibrium position is
equal to

\[ \omega_{\mathrm{nut}} = \frac{I_3\omega_3}{I_1}. \]


PROOF. In the absence of gravity the effective potential energy reduces to

\[ U_{\mathrm{eff}} = \frac{(M_z - M_3\cos\theta)^2}{2I_1\sin^2\theta}. \]

This nonnegative function has the minimum value of zero for the angle $\theta = \theta_0$ determined by
the condition $M_z = M_3\cos\theta_0$ (Figure 130). Thus, the angle of inclination $\theta_0$ of the top’s axis
to the vertical is stably stationary: for small deviations of the initial angle $\theta$ from $\theta_0$, there will
be periodic oscillations of $\theta$ near $\theta_0$ (nutation).

Figure 130 Effective potential energy of a top

51 Denote by $\varphi_g(t, \xi)$ the position of the top at time $t$ with initial condition $\xi \in TSO(3)$ and
gravitational acceleration $g$. Then the theorem says that

\[ \varphi_g(t, N\xi) = \varphi_{N^{-2}g}(Nt, \xi). \]

The frequency of these oscillations is easily
determined by the following general formula: the frequency $\omega$ of small oscillations in a one-dimensional
system with energy

\[ E = \frac{a\dot x^2}{2} + U(x), \qquad U(x_0) = \min U(x) \]

is given (Section 22D) by the formula

\[ \omega^2 = \frac{U''(x_0)}{a}. \]

The energy of the one-dimensional system describing oscillations of the inclination of the top’s
axis is

\[ \frac{I_1}{2}\dot\theta^2 + U_{\mathrm{eff}}. \]

For $\theta = \theta_0 + x$ we find $M_z - M_3\cos\theta = M_3(\cos\theta_0 - \cos(\theta_0 + x)) = M_3 x\sin\theta_0 + O(x^2)$ and

\[ U_{\mathrm{eff}} = \frac{M_3^2 x^2\sin^2\theta_0}{2I_1\sin^2\theta_0} + O(x^3) = \frac{I_3^2\omega_3^2}{2I_1}\,x^2 + \cdots, \]

from which we obtain the expression for the frequency of nutation

\[ \omega_{\mathrm{nut}} = \frac{I_3\omega_3}{I_1}. \]
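
The general formula $\omega^2 = U''(x_0)/a$ can be checked numerically against the closed form $I_3\omega_3/I_1$. The Python sketch below is illustrative only: the names `u_eff` and `small_oscillation_frequency`, the finite-difference step `h`, and the sample values are our assumptions.

```python
import math

def u_eff(theta, Mz, M3, I1, mgl=0.0):
    """Effective potential (Mz - M3 cos(theta))^2 / (2 I1 sin^2(theta)) + mgl cos(theta)."""
    return (Mz - M3 * math.cos(theta)) ** 2 / (2.0 * I1 * math.sin(theta) ** 2) \
        + mgl * math.cos(theta)

def small_oscillation_frequency(U, x0, a, h=1e-4):
    """omega = sqrt(U''(x0) / a), with U'' estimated by a central difference."""
    second = (U(x0 + h) - 2.0 * U(x0) + U(x0 - h)) / h ** 2
    return math.sqrt(second / a)

# Hypothetical top without gravity: I1 = 2, I3 = 1, omega3 = 10, theta0 = 1 rad.
I1, I3, omega3, theta0 = 2.0, 1.0, 10.0, 1.0
M3 = I3 * omega3
Mz = M3 * math.cos(theta0)          # condition defining the equilibrium theta0
omega_numeric = small_oscillation_frequency(
    lambda th: u_eff(th, Mz, M3, I1), theta0, a=I1)
omega_formula = I3 * omega3 / I1    # the value given by the lemma
```

The two values agree up to the finite-difference error.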


From the formula $\dot\varphi = (M_z - M_3\cos\theta)/I_1\sin^2\theta$ it is clear that, for
$\theta = \theta_0$, the azimuth of the axis does not change with time: the axis is
stationary. The azimuthal motion of the axis under small deviations of $\theta$
from $\theta_0$ could also be studied with the help of this formula, but we will deal
with it differently.

The motion of a top in the absence of gravity can be considered in 
Poinsot’s description. Then the axis of the top rotates uniformly around the 
angular momentum vector, preserving its position in space. Thus, the axis 
of the top describes a circle on the sphere whose center corresponds to the 
angular momentum vector (Figure 131). 


Remark. Now the motion of the top’s axis, which according to Lagrange was called nutation,
is called precession in Poinsot’s description of motion.

Figure 131 Comparison of the descriptions of the motion of a top according to
Lagrange and Poinsot

This means that the formula obtained above for the frequency of a small
nutation, $\omega_{\mathrm{nut}} = I_3\omega_3/I_1$, agrees with the formula for the frequency of
precession $\omega = M/I_1$ in Poinsot’s description: when the amplitude of
nutation approaches zero, $I_3\omega_3 \to M$.


C A top in a weak field


We go now to the case when the force of gravity is not absent, but is very
small (the values of $M_z$ and $M_3$ are fixed). In this case a term $mgl\cos\theta$,
small together with its derivatives, is added to the effective potential energy.
We will show that this term slightly changes the frequency of nutation.


Lemma. Suppose that the function $f(x)$ has a minimum at $x = 0$ and Taylor expansion $f(x) =
Ax^2/2 + \cdots$, $A > 0$. Suppose that the function $h(x)$ has Taylor expansion $h(x) = B + Cx + \cdots$.
Then, for sufficiently small $\varepsilon$, the function $f_\varepsilon(x) = f(x) + \varepsilon h(x)$ has a minimum at the point
(Figure 132)

\[ x_\varepsilon = -\frac{C\varepsilon}{A} + O(\varepsilon^2), \]

which is close to zero. In addition, $f_\varepsilon''(x_\varepsilon) = A + O(\varepsilon)$.


PROOF. We have $f_\varepsilon'(x) = Ax + C\varepsilon + O(x^2) + O(\varepsilon x)$, and the result is obtained by applying the
implicit function theorem to $f_\varepsilon'(x)$. $\square$
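
The estimate of the lemma is easy to observe numerically. In the Python sketch below, which is only an illustration, the function name `displaced_minimum` and the sample functions $f(x) = x^2 + x^3$ (so $A = 2$) and $h(x) = B + x + x^2$ (so $C = 1$) are our assumptions; the critical point of $f + \varepsilon h$ is found by Newton's method started at $x = 0$.

```python
def displaced_minimum(f_prime, f_second, h_prime, h_second, eps, steps=50):
    """Newton iteration for the critical point of f + eps*h, started at x = 0."""
    x = 0.0
    for _ in range(steps):
        slope = f_prime(x) + eps * h_prime(x)
        curvature = f_second(x) + eps * h_second(x)
        x -= slope / curvature
    return x

A, C, eps = 2.0, 1.0, 1e-3
x_eps = displaced_minimum(
    f_prime=lambda x: A * x + 3.0 * x * x,   # f(x) = A x^2 / 2 + x^3
    f_second=lambda x: A + 6.0 * x,
    h_prime=lambda x: C + 2.0 * x,           # h(x) = B + C x + x^2
    h_second=lambda x: 2.0,
    eps=eps,
)
predicted = -C * eps / A                      # leading term given by the lemma
curvature_at_minimum = A + 6.0 * x_eps + 2.0 * eps
```

The difference `x_eps - predicted` is of order $\varepsilon^2$, and `curvature_at_minimum` differs from $A$ by a quantity of order $\varepsilon$.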


Figure 132 Displacement of the minimum under a small change of the function


By the lemma, the effective potential energy for small $g$ has a minimum
$\theta_g$ close to $\theta_0$, and at this point $U''$ differs slightly from $U''(\theta_0)$. Therefore, the
frequency of a small nutation near $\theta_g$ is close to that obtained for $g = 0$:

\[ \lim_{g \to 0} \omega_{\mathrm{nut}} = \frac{I_3}{I_1}\,\omega_3. \]


D A rapidly thrown top 


We now consider the special initial conditions when we release the axis of
the top without an initial push from a position with inclination $\theta_0$ to the
vertical.


Theorem. If the axis of the top is stationary at the initial moment ($\dot\varphi = \dot\theta = 0$)
and the top is rotating rapidly around its axis ($\omega_3 \to \infty$), which is inclined
from the vertical with angle $\theta_0$ ($M_z = M_3\cos\theta_0$), then asymptotically, as
$\omega_3 \to \infty$,


1. the nutation frequency is proportional to the angular velocity;

2. the amplitude of nutation is inversely proportional to the square of the
angular velocity;

3. the frequency of precession is inversely proportional to the angular
velocity;

4. the following asymptotic formulas hold (as $\omega_3 \to \infty$):

\[ \omega_{\mathrm{nut}} \sim \frac{I_3}{I_1}\,\omega_3, \qquad a_{\mathrm{nut}} \sim \frac{I_1 mgl}{I_3^2\omega_3^2}\,\sin\theta_0, \qquad \omega_{\mathrm{prec}} \sim \frac{mgl}{I_3\omega_3} \]

(here $f(\omega_3) \sim g(\omega_3)$ if $\lim_{\omega_3 \to \infty} (f/g) = 1$).
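
The three leading-order formulas are simple enough to evaluate directly. The Python sketch below is an illustration under our own naming (`fast_top_asymptotics` and the sample values are hypothetical); it returns only the asymptotic expressions of item 4, not the exact motion.

```python
import math

def fast_top_asymptotics(omega3, theta0, m, g, l, I1, I3):
    """Leading-order nutation frequency, nutation amplitude and precession frequency."""
    omega_nut = I3 * omega3 / I1
    a_nut = I1 * m * g * l * math.sin(theta0) / (I3 * omega3) ** 2
    omega_prec = m * g * l / (I3 * omega3)
    return omega_nut, a_nut, omega_prec

# Hypothetical top; doubling the spin should double the nutation frequency,
# quarter the nutation amplitude and halve the precession frequency.
slow = fast_top_asymptotics(100.0, 0.5, 1.0, 9.8, 0.1, 0.01, 0.02)
fast = fast_top_asymptotics(200.0, 0.5, 1.0, 9.8, 0.1, 0.01, 0.02)
ratios = tuple(q_fast / q_slow for q_fast, q_slow in zip(fast, slow))
```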


For the proof, we look at the case when the initial angular velocity is
fixed, but $g \to 0$. Then by interpreting the formulas with the aid of a similarity
argument (cf. Section B), we obtain the theorem.


We already know from Section 30C that under our initial conditions the axis of the top traces 
a curve with cusps on the sphere. 


Figure 133 Definition of the amplitude of nutation


We apply the lemma to locate the minimum point $\theta_g$ of the effective potential energy. We
set (Figure 133)

\[ \theta = \theta_0 + x, \qquad \cos\theta = \cos\theta_0 - x\sin\theta_0 + \cdots. \]

Then we obtain, as above, the Taylor expansion in $x$ at $\theta_0$

\[ U_{\mathrm{eff}}\big|_{g=0} = \frac{I_3^2\omega_3^2}{2I_1}\,x^2 + \cdots, \qquad mgl\cos\theta = mgl\cos\theta_0 - x\,mgl\sin\theta_0 + \cdots. \]

Applying the lemma to $f = U_{\mathrm{eff}}|_{g=0}$, $\varepsilon = g$, $h = ml\cos(\theta_0 + x)$, we find that the minimum of the
effective potential energy $U_{\mathrm{eff}}$ is attained at angle of inclination

\[ \theta_g = \theta_0 + x_g, \qquad x_g = \frac{I_1 ml\sin\theta_0}{I_3^2\omega_3^2}\,g + O(g^2). \]

Thus the inclination $\theta$ of the top’s axis will oscillate near $\theta_g$ (Figure 134). But, at the initial moment,
$\theta = \theta_0$ and $\dot\theta = 0$. This means that $\theta_0$ corresponds to the highest position of the axis of the top.
Thus, for small $g$, the amplitude of nutation is asymptotically equal to

\[ a_{\mathrm{nut}} \sim x_g \sim \frac{I_1 ml\sin\theta_0}{I_3^2\omega_3^2}\,g \qquad (g \to 0). \]

Figure 134 Motion of a top’s axis


We now find the precessional motion of the axis. From the general formula

\[ \dot\varphi = \frac{M_z - M_3\cos\theta}{I_1\sin^2\theta} \]

for $M_z = M_3\cos\theta_0$ and $\theta = \theta_0 + x$, we find that $M_z - M_3\cos\theta = M_3 x\sin\theta_0 + \cdots$; so

\[ \dot\varphi = \frac{M_3}{I_1\sin\theta_0}\,x + \cdots. \]

But $x$ oscillates harmonically between $0$ and $2x_g$ (up to $O(g^2)$). Therefore, the average value of
the velocity of precession over the period of nutation is asymptotically equal to

\[ \bar{\dot\varphi} \sim \frac{M_3}{I_1\sin\theta_0}\,x_g = \frac{mgl}{I_3\omega_3} \qquad (g \to 0). \]


PROBLEM. Show that

\[ \lim_{g \to 0}\,\lim_{t \to \infty} \frac{\varphi(t) - \varphi(0)}{mglt/I_3\omega_3} = 1. \]

PART III
HAMILTONIAN MECHANICS 


Hamiltonian mechanics is geometry in phase space. Phase space has the 
structure of a symplectic manifold. The group of symplectic diffeomorphisms 
acts on phase space. The basic concepts and theorems of hamiltonian 
mechanics (even when formulated in terms of local symplectic coordinates) 
are invariant under this group (and under the larger group of transformations 
which also transform time). 

A hamiltonian mechanical system is given by an even-dimensional mani- 
fold (the “phase space”), a symplectic structure on it (the “Poincaré integral 
invariant”) and a function on it (the “hamiltonian function”). Every one- 
parameter group of symplectic diffeomorphisms of the phase space pre- 
serving the hamiltonian function is associated to a first integral of the 
equations of motion. 

Lagrangian mechanics is contained in hamiltonian mechanics as a special 
case (the phase space in this case is the cotangent bundle of the configuration 
space, and the hamiltonian function is the Legendre transform of the lagrang- 
ian function). 

The hamiltonian point of view allows us to solve completely a series of 
mechanical problems which do not yield solutions by other means (for 
example, the problem of attraction by two stationary centers and the problem 
of geodesics on the triaxial ellipsoid). The hamiltonian point of view has 
even greater value for the approximate methods of perturbation theory 
(celestial mechanics), for understanding the general character of motion 
in complicated mechanical systems (ergodic theory, statistical mechanics) 
and in connection with other areas of mathematical physics (optics, quantum 
mechanics, etc.). 


Differential forms 


Exterior differential forms arise when concepts such as the work of a field 
along a path and the flux of a fluid through a surface are generalized to higher
dimensions. 

Hamiltonian mechanics cannot be understood without differential forms. 
The information we need about differential forms involves exterior multi- 
plication, exterior differentiation, integration, and Stokes’ formula. 

32 Exterior forms 


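Here we define exterior algebraic forms.


A 1-forms


Let $\mathbb{R}^n$ be an n-dimensional real vector space.⁵² We will denote vectors in this
space by $\xi, \eta, \ldots$.


Definition. A form of degree 1 (or a 1-form) is a linear function $\omega\colon \mathbb{R}^n \to \mathbb{R}$, i.e.,

\[ \omega(\lambda_1\xi_1 + \lambda_2\xi_2) = \lambda_1\omega(\xi_1) + \lambda_2\omega(\xi_2), \qquad \forall \lambda_1, \lambda_2 \in \mathbb{R} \text{ and } \xi_1, \xi_2 \in \mathbb{R}^n. \]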


We recall the basic facts about 1-forms from linear algebra. The set of all 
1-forms becomes a real vector space if we define the sum of two forms by 


\[ (\omega_1 + \omega_2)(\xi) = \omega_1(\xi) + \omega_2(\xi), \]

and scalar multiplication by

\[ (\lambda\omega)(\xi) = \lambda\omega(\xi). \]

52 It is essential to note that we do not fix any special euclidean structure on $\mathbb{R}^n$. In some examples
we use such a structure; in these cases this will be specifically stated (“euclidean $\mathbb{R}^n$”).

The space of 1-forms on $\mathbb{R}^n$ is itself n-dimensional, and is also called the dual
space $(\mathbb{R}^n)^*$.

Suppose that we have chosen a linear coordinate system $x_1, \ldots, x_n$ on $\mathbb{R}^n$.
Each coordinate $x_i$ is itself a 1-form. These n 1-forms are linearly independent.
Therefore, every 1-form $\omega$ has the form

\[ \omega = a_1x_1 + \cdots + a_nx_n, \qquad a_i \in \mathbb{R}. \]

The value of $\omega$ on a vector $\xi$ is equal to

\[ \omega(\xi) = a_1x_1(\xi) + \cdots + a_nx_n(\xi), \]

where $x_1(\xi), \ldots, x_n(\xi)$ are the components of $\xi$ in the chosen coordinate
system.
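
In coordinates, then, a 1-form is determined by its coefficients $a_1, \ldots, a_n$. The following Python sketch illustrates this; the name `one_form` and the representation of vectors as tuples of components are our assumptions.

```python
def one_form(coeffs):
    """Return the 1-form a_1 x_1 + ... + a_n x_n as a function of a vector."""
    coeffs = tuple(coeffs)

    def omega(xi):
        # Value on xi: sum of coefficients times components of xi.
        return sum(a * x for a, x in zip(coeffs, xi))

    return omega

omega = one_form([2, -1, 3])
xi, eta = (1, 0, 4), (0, 5, -2)
combo = tuple(2 * x + 3 * y for x, y in zip(xi, eta))   # the vector 2*xi + 3*eta
```

Linearity then reads `omega(combo) == 2 * omega(xi) + 3 * omega(eta)`.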


EXAMPLE. If a uniform force field $\mathbf{F}$ is given on euclidean $\mathbb{R}^3$, its work $A$ on the displacement $\xi$
is a 1-form acting on $\xi$ (Figure 135).


Figure 135 The work of a force is a 1-form acting on the displacement. (Labels: $\mathbf{F}$ (force), $\xi$ (displacement), $\omega(\xi) = (\mathbf{F}, \xi)$.)


B 2-forms 


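Definition. An exterior form of degree 2 (or a 2-form) is a function on pairs of
vectors $\omega^2\colon \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, which is bilinear and skew symmetric:

\[ \omega^2(\lambda_1\xi_1 + \lambda_2\xi_2, \xi_3) = \lambda_1\omega^2(\xi_1, \xi_3) + \lambda_2\omega^2(\xi_2, \xi_3), \]

\[ \omega^2(\xi_1, \xi_2) = -\omega^2(\xi_2, \xi_1), \qquad \forall \lambda_1, \lambda_2 \in \mathbb{R},\ \xi_1, \xi_2, \xi_3 \in \mathbb{R}^n. \]


EXAMPLE 1. Let $S(\xi_1, \xi_2)$ be the oriented area of the parallelogram constructed on the vectors
$\xi_1$ and $\xi_2$ of the oriented euclidean plane $\mathbb{R}^2$, i.e.,

\[ S(\xi_1, \xi_2) = \begin{vmatrix} \xi_{11} & \xi_{12} \\ \xi_{21} & \xi_{22} \end{vmatrix}, \qquad \text{where } \xi_1 = \xi_{11}\mathbf{e}_1 + \xi_{12}\mathbf{e}_2,\ \xi_2 = \xi_{21}\mathbf{e}_1 + \xi_{22}\mathbf{e}_2, \]

with $\mathbf{e}_1$, $\mathbf{e}_2$ a basis giving the orientation on $\mathbb{R}^2$.
It is easy to see that $S(\xi_1, \xi_2)$ is a 2-form (Figure 136).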


EXAMPLE 2. Let $\mathbf{v}$ be a uniform velocity vector field for a fluid in three-dimensional oriented
euclidean space (Figure 137). Then the flux of the fluid over the area of the parallelogram
$\xi_1, \xi_2$ is a bilinear skew symmetric function of $\xi_1$ and $\xi_2$, i.e., a 2-form defined by the triple scalar
product

\[ \omega^2(\xi_1, \xi_2) = (\mathbf{v}, \xi_1, \xi_2). \]


Figure 136 Oriented area is a 2-form.


Figure 137 Flux of a fluid through a surface is a 2-form.


EXAMPLE 3. The oriented area of the projection of the parallelogram with sides $\xi_1$ and $\xi_2$ on
the $x_1, x_2$-plane in euclidean $\mathbb{R}^3$ is a 2-form.


PROBLEM 1. Show that for every 2-form $\omega^2$ on $\mathbb{R}^n$ we have

\[ \omega^2(\xi, \xi) = 0, \qquad \forall \xi \in \mathbb{R}^n. \]

Solution. By skew symmetry, $\omega^2(\xi, \xi) = -\omega^2(\xi, \xi)$.
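
The flux form of Example 2 gives a concrete case in which these properties can be checked by hand or by machine. The Python sketch below is illustrative; the names `triple_product` and `flux_form` and the sample vectors are our assumptions.

```python
def triple_product(v, a, b):
    """Triple scalar product (v, a, b) = v . (a x b) in oriented euclidean R^3."""
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    return sum(vi * ci for vi, ci in zip(v, cross))

def flux_form(v):
    """The 2-form (xi1, xi2) -> (v, xi1, xi2) attached to a uniform velocity field v."""
    return lambda xi1, xi2: triple_product(v, xi1, xi2)

omega2 = flux_form((0, 0, 1))            # fluid moving along the third axis
xi1, xi2 = (1, 2, 3), (4, 5, 6)
```

Here `omega2(xi1, xi2)` and `omega2(xi2, xi1)` differ only in sign, and `omega2(xi1, xi1)` vanishes, as Problem 1 asserts.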
The set of all 2-forms on $\mathbb{R}^n$ becomes a real vector space if we define the
addition of forms by the formula

\[ (\omega_1 + \omega_2)(\xi_1, \xi_2) = \omega_1(\xi_1, \xi_2) + \omega_2(\xi_1, \xi_2) \]

and multiplication by scalars by the formula

\[ (\lambda\omega)(\xi_1, \xi_2) = \lambda\omega(\xi_1, \xi_2). \]


PROBLEM 2. Show that this space is finite-dimensional, and find its dimension.
ANSWER. $n(n - 1)/2$; a basis is shown below.


C k-forms 


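Definition. An exterior form of degree k, or a k-form, is a function of k vectors
which is k-linear and antisymmetric:

\[ \omega(\lambda_1\xi_1' + \lambda_2\xi_1'', \xi_2, \ldots, \xi_k) = \lambda_1\omega(\xi_1', \xi_2, \ldots, \xi_k) + \lambda_2\omega(\xi_1'', \xi_2, \ldots, \xi_k), \]

\[ \omega(\xi_{i_1}, \ldots, \xi_{i_k}) = (-1)^\nu\,\omega(\xi_1, \ldots, \xi_k), \]

where

\[ \nu = \begin{cases} 0 & \text{if the permutation } i_1, \ldots, i_k \text{ is even;} \\ 1 & \text{if the permutation } i_1, \ldots, i_k \text{ is odd.} \end{cases} \]


Figure 138 Oriented volume is a 3-form.


EXAMPLE 1. The oriented volume of the parallelepiped with edges $\xi_1, \ldots, \xi_n$ in oriented euclidean
space $\mathbb{R}^n$ is an n-form (Figure 138):

\[ V(\xi_1, \ldots, \xi_n) = \begin{vmatrix} \xi_{11} & \cdots & \xi_{1n} \\ \vdots & & \vdots \\ \xi_{n1} & \cdots & \xi_{nn} \end{vmatrix}, \]

where $\xi_i = \xi_{i1}\mathbf{e}_1 + \cdots + \xi_{in}\mathbf{e}_n$ and $\mathbf{e}_1, \ldots, \mathbf{e}_n$ are a basis of $\mathbb{R}^n$.


EXAMPLE 2. Let $\mathbb{R}^k$ be an oriented k-plane in n-dimensional euclidean space $\mathbb{R}^n$. Then the
k-dimensional oriented volume of the projection of the parallelepiped with edges $\xi_1, \xi_2, \ldots,
\xi_k \in \mathbb{R}^n$ onto $\mathbb{R}^k$ is a k-form on $\mathbb{R}^n$.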


The set of all k-forms on $\mathbb{R}^n$ forms a real vector space if we introduce
operations of addition

\[ (\omega_1 + \omega_2)(\xi) = \omega_1(\xi) + \omega_2(\xi), \qquad \xi = \{\xi_1, \ldots, \xi_k\},\ \xi_j \in \mathbb{R}^n, \]

and multiplication by scalars

\[ (\lambda\omega)(\xi) = \lambda\omega(\xi). \]


PROBLEM 3. Show that this vector space is finite-dimensional and find its dimension.
ANSWER. $\binom{n}{k}$; a basis is shown below.


D The exterior product of two 1-forms 


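We now introduce one more operation: exterior multiplication of forms.
If $\omega^k$ is a k-form and $\omega^l$ is an l-form on $\mathbb{R}^n$, then their exterior product $\omega^k \wedge \omega^l$
will be a (k + l)-form. We first define the exterior product of 1-forms, which
associates to every pair of 1-forms $\omega_1, \omega_2$ on $\mathbb{R}^n$ a 2-form $\omega_1 \wedge \omega_2$ on $\mathbb{R}^n$.

Let $\xi$ be a vector in $\mathbb{R}^n$. Given two 1-forms $\omega_1$ and $\omega_2$, we can define a
mapping of $\mathbb{R}^n$ to the plane $\mathbb{R} \times \mathbb{R}$ by associating to $\xi \in \mathbb{R}^n$ the vector $\omega(\xi)$
with components $\omega_1(\xi)$ and $\omega_2(\xi)$ in the plane with coordinates $\omega_1, \omega_2$
(Figure 139).


Definition. The value of the exterior product $\omega_1 \wedge \omega_2$ on the pair of vectors
$\xi_1, \xi_2 \in \mathbb{R}^n$ is the oriented area of the image of the parallelogram with sides
$\omega(\xi_1)$ and $\omega(\xi_2)$ on the $\omega_1, \omega_2$-plane:

\[ (\omega_1 \wedge \omega_2)(\xi_1, \xi_2) = \begin{vmatrix} \omega_1(\xi_1) & \omega_2(\xi_1) \\ \omega_1(\xi_2) & \omega_2(\xi_2) \end{vmatrix}. \]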
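
Written out, the determinant in this definition is $\omega_1(\xi_1)\omega_2(\xi_2) - \omega_2(\xi_1)\omega_1(\xi_2)$. A minimal Python sketch of this operation follows; the names `coordinate_form` and `wedge_1forms`, and the use of 0-based indices for the coordinates, are our assumptions.

```python
def coordinate_form(i):
    """The coordinate 1-form xi -> xi[i] (index i counted from 0)."""
    return lambda xi: xi[i]

def wedge_1forms(omega1, omega2):
    """Exterior product of two 1-forms, as the 2x2 determinant of their values."""
    def product(xi1, xi2):
        return omega1(xi1) * omega2(xi2) - omega2(xi1) * omega1(xi2)
    return product

x1, x2 = coordinate_form(0), coordinate_form(1)
area_12 = wedge_1forms(x1, x2)     # oriented area of the projection on the first two coordinates
xi1, xi2 = (1, 2, 3), (4, 5, 6)
```

For these sample vectors the value is $1 \cdot 5 - 2 \cdot 4 = -3$; interchanging either the vectors or the forms changes the sign.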


Figure 139 Definition of the exterior product of two 1-forms


PROBLEM 4. Show that $\omega_1 \wedge \omega_2$ really is a 2-form.

PROBLEM 5. Show that the mapping

\[ (\omega_1, \omega_2) \to \omega_1 \wedge \omega_2 \]

is bilinear and skew symmetric:

\[ \omega_1 \wedge \omega_2 = -\omega_2 \wedge \omega_1, \]

\[ (\lambda'\omega_1' + \lambda''\omega_1'') \wedge \omega_2 = \lambda'\omega_1' \wedge \omega_2 + \lambda''\omega_1'' \wedge \omega_2. \]

Hint. The determinant is bilinear and skew-symmetric not only with respect to rows, but
also with respect to columns.


Now suppose we have chosen a system of linear coordinates on $\mathbb{R}^n$, i.e., we
are given n independent 1-forms $x_1, \ldots, x_n$. We will call these forms basic.

The exterior products of the basic forms are the 2-forms $x_i \wedge x_j$. By skew-symmetry,
$x_i \wedge x_i = 0$ and $x_i \wedge x_j = -x_j \wedge x_i$. The geometric meaning of
the form $x_i \wedge x_j$ is very simple: its value on the pair of vectors $\xi_1, \xi_2$ is equal
to the oriented area of the image of the parallelogram $\xi_1, \xi_2$ on the coordinate
plane $x_i, x_j$ under the projection parallel to the remaining coordinate
directions.


PROBLEM 6. Show that the $\binom{n}{2} = n(n - 1)/2$ forms $x_i \wedge x_j$ ($i < j$) are linearly independent.


In particular, in three-dimensional euclidean space $(x_1, x_2, x_3)$, the area
of the projection on the $(x_1, x_2)$-plane is $x_1 \wedge x_2$, on the $(x_2, x_3)$-plane it is
$x_2 \wedge x_3$, and on the $(x_3, x_1)$-plane it is $x_3 \wedge x_1$.


PROBLEM 7. Show that every 2-form in the three-dimensional space $(x_1, x_2, x_3)$ is of the form

\[ P\,x_2 \wedge x_3 + Q\,x_3 \wedge x_1 + R\,x_1 \wedge x_2. \]


PROBLEM 8. Show that every 2-form on the n-dimensional space with coordinates $x_1, \ldots, x_n$
can be uniquely represented in the form

\[ \omega^2 = \sum_{i<j} a_{ij}\,x_i \wedge x_j. \]

Hint. Let $\mathbf{e}_i$ be the i-th basis vector, i.e., $x_i(\mathbf{e}_i) = 1$, $x_j(\mathbf{e}_i) = 0$ for $i \ne j$. Look at the value of
the form $\omega^2$ on the pair $\mathbf{e}_i, \mathbf{e}_j$. Then

\[ a_{ij} = \omega^2(\mathbf{e}_i, \mathbf{e}_j). \]


E Exterior monomials 


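Suppose that we are given k 1-forms $\omega_1, \ldots, \omega_k$. We define their exterior
product $\omega_1 \wedge \cdots \wedge \omega_k$.


Definition. Set

\[ (\omega_1 \wedge \cdots \wedge \omega_k)(\xi_1, \ldots, \xi_k) = \begin{vmatrix} \omega_1(\xi_1) & \cdots & \omega_k(\xi_1) \\ \vdots & & \vdots \\ \omega_1(\xi_k) & \cdots & \omega_k(\xi_k) \end{vmatrix}. \]

In other words, the value of a product of 1-forms on the parallelepiped
$\xi_1, \ldots, \xi_k$ is equal to the oriented volume of the image of the parallelepiped
in the oriented euclidean coordinate space $\mathbb{R}^k$ under the mapping $\xi \to
(\omega_1(\xi), \ldots, \omega_k(\xi))$.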
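
This definition translates directly into a small program. The Python sketch below is an illustration only; the helper names are ours, and the determinant is expanded by the permutation (Leibniz) formula, which is adequate for small k but not efficient.

```python
from itertools import permutations

def permutation_sign(p):
    """+1 for an even permutation of 0..k-1, -1 for an odd one (inversion count)."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def determinant(matrix):
    """Determinant as the signed sum over permutations of products of entries."""
    total = 0
    for p in permutations(range(len(matrix))):
        term = permutation_sign(p)
        for row, col in enumerate(p):
            term *= matrix[row][col]
        total += term
    return total

def exterior_monomial(*forms):
    """omega_1 ^ ... ^ omega_k: the value on vectors is det |omega_j(xi_i)|."""
    def value(*vectors):
        return determinant([[w(xi) for w in forms] for xi in vectors])
    return value

x1, x2, x3 = (lambda xi, i=i: xi[i] for i in range(3))   # coordinate 1-forms on R^3
volume = exterior_monomial(x1, x2, x3)
e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
```

With these definitions `volume(e1, e2, e3)` is the oriented volume of the unit cube, and a monomial with a repeated factor vanishes identically.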


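PROBLEM 9. Show that $\omega_1 \wedge \cdots \wedge \omega_k$ is a k-form.

PROBLEM 10. Show that the operation of exterior product of 1-forms gives a multi-linear skew-symmetric
mapping

\[ (\omega_1, \ldots, \omega_k) \to \omega_1 \wedge \cdots \wedge \omega_k. \]

In other words,

\[ (\lambda'\omega_1' + \lambda''\omega_1'') \wedge \omega_2 \wedge \cdots \wedge \omega_k = \lambda'\omega_1' \wedge \omega_2 \wedge \cdots \wedge \omega_k + \lambda''\omega_1'' \wedge \omega_2 \wedge \cdots \wedge \omega_k \]

and

\[ \omega_{i_1} \wedge \cdots \wedge \omega_{i_k} = (-1)^\nu\,\omega_1 \wedge \cdots \wedge \omega_k, \]

where

\[ \nu = \begin{cases} 0 & \text{if the permutation } i_1, \ldots, i_k \text{ is even,} \\ 1 & \text{if the permutation } i_1, \ldots, i_k \text{ is odd.} \end{cases} \]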


Now consider a coordinate system on $\mathbb{R}^n$ given by the basic forms $x_1, \ldots,
x_n$. The exterior product of k basic forms

\[ x_{i_1} \wedge \cdots \wedge x_{i_k}, \qquad 1 \le i_m \le n, \]

is the oriented volume of the image of a k-parallelepiped on the k-plane
$(x_{i_1}, \ldots, x_{i_k})$ under the projection parallel to the remaining coordinate
directions.


PROBLEM 11. Show that, if two of the indices $i_1, \ldots, i_k$ are the same, then the form $x_{i_1} \wedge \cdots \wedge x_{i_k}$
is zero.


PROBLEM 12. Show that the forms

\[ x_{i_1} \wedge \cdots \wedge x_{i_k}, \qquad \text{where } 1 \le i_1 < i_2 < \cdots < i_k \le n, \]

are linearly independent.

The number of such forms is clearly $\binom{n}{k}$. We will call them basic k-forms.


PROBLEM 13. Show that every k-form on $\mathbb{R}^n$ can be uniquely represented as a linear combination
of basic forms:

\[ \omega^k = \sum_{1 \le i_1 < \cdots < i_k \le n} a_{i_1, \ldots, i_k}\,x_{i_1} \wedge \cdots \wedge x_{i_k}. \]


It follows as a result of this problem that the dimension of the vector space
of k-forms on $\mathbb{R}^n$ is equal to $\binom{n}{k}$. In particular, for $k = n$, $\binom{n}{n} = 1$, from which
follows


Corollary. Every n-form on $\mathbb{R}^n$ is either the oriented volume of a parallelepiped
with some choice of unit volume, or zero:

\[ \omega^n = a \cdot x_1 \wedge \cdots \wedge x_n. \]

PROBLEM 14. Show that every k-form on $\mathbb{R}^n$ with $k > n$ is zero.


We now consider the product of a k-form $\omega^k$ and an l-form $\omega^l$. First,
suppose that we are given two monomials

\[ \omega^k = \omega_1 \wedge \cdots \wedge \omega_k \quad \text{and} \quad \omega^l = \omega_{k+1} \wedge \cdots \wedge \omega_{k+l}, \]

where $\omega_1, \ldots, \omega_{k+l}$ are 1-forms. We define their product $\omega^k \wedge \omega^l$ to be the
monomial

\[ (\omega_1 \wedge \cdots \wedge \omega_k) \wedge (\omega_{k+1} \wedge \cdots \wedge \omega_{k+l}) = \omega_1 \wedge \cdots \wedge \omega_k \wedge \omega_{k+1} \wedge \cdots \wedge \omega_{k+l}. \]


PROBLEM 15. Show that the product of monomials is associative:

\[ (\omega^k \wedge \omega^l) \wedge \omega^m = \omega^k \wedge (\omega^l \wedge \omega^m) \]

and skew-commutative:

\[ \omega^k \wedge \omega^l = (-1)^{kl}\,\omega^l \wedge \omega^k. \]

Hint. In order to move each of the l factors of $\omega^l$ forward, we need k inversions with the
k factors of $\omega^k$.


Remark. It is useful to remember that skew-commutativity means commutativity if at least
one of the degrees k and l is even, and anti-commutativity if both degrees k and l are odd.

33 Exterior multiplication


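We define here the operation of exterior multiplication of forms and show that it is skew-commutative,
distributive, and associative.


A Definition of exterior multiplication


We now define the exterior multiplication of an arbitrary k-form $\omega^k$ by an
arbitrary l-form $\omega^l$. The result $\omega^k \wedge \omega^l$ will be a (k + l)-form. The operation of
multiplication turns out to be:


1. skew-commutative: $\omega^k \wedge \omega^l = (-1)^{kl}\,\omega^l \wedge \omega^k$;
2. distributive: $(\lambda_1\omega_1^k + \lambda_2\omega_2^k) \wedge \omega^l = \lambda_1\omega_1^k \wedge \omega^l + \lambda_2\omega_2^k \wedge \omega^l$;
3. associative: $(\omega^k \wedge \omega^l) \wedge \omega^m = \omega^k \wedge (\omega^l \wedge \omega^m)$.


Definition. The exterior product $\omega^k \wedge \omega^l$ of a k-form $\omega^k$ on $\mathbb{R}^n$ with an
l-form $\omega^l$ on $\mathbb{R}^n$ is the (k + l)-form on $\mathbb{R}^n$ whose value on the k + l vectors
$\xi_1, \ldots, \xi_k, \xi_{k+1}, \ldots, \xi_{k+l} \in \mathbb{R}^n$ is equal to

\[ (\omega^k \wedge \omega^l)(\xi_1, \ldots, \xi_{k+l}) = \sum (-1)^\nu\,\omega^k(\xi_{i_1}, \ldots, \xi_{i_k})\,\omega^l(\xi_{j_1}, \ldots, \xi_{j_l}), \tag{1} \]

where $i_1 < \cdots < i_k$ and $j_1 < \cdots < j_l$; $(i_1, \ldots, i_k, j_1, \ldots, j_l)$ is a permutation
of the numbers $(1, 2, \ldots, k + l)$; and

\[ \nu = \begin{cases} 1 & \text{if this permutation is odd;} \\ 0 & \text{if this permutation is even.} \end{cases} \]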


In other words, every partition of the k + l vectors $\xi_1, \ldots, \xi_{k+l}$ into two
groups (of k and of l vectors) gives one term in our sum (1). This term is equal
to the product of the value of the k-form $\omega^k$ on the k vectors of the first group
with the value of the l-form $\omega^l$ on the l vectors of the second group, with sign
+ or − depending on how the vectors are ordered in the groups. If they are
ordered in such a way that the k vectors of the first group and the l vectors of
the second group written in succession form an even permutation of the
vectors $\xi_1, \xi_2, \ldots, \xi_{k+l}$, then we take the sign to be +, and if they form an
odd permutation we take the sign to be −.


EXAMPLE. If k = l = 1, then there are just two partitions: $\xi_1, \xi_2$ and $\xi_2, \xi_1$.
Therefore,

\[ (\omega_1 \wedge \omega_2)(\xi_1, \xi_2) = \omega_1(\xi_1)\omega_2(\xi_2) - \omega_2(\xi_1)\omega_1(\xi_2), \]

which agrees with the definition of multiplication of 1-forms in Section 32.
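
Formula (1) can be implemented almost word for word: choose the k indices of the first group in increasing order, take the remaining l indices as the second group, and attach the sign of the resulting permutation. The Python sketch below is illustrative only; the names `wedge` and `permutation_sign`, the convention of passing the degrees explicitly, and the 0-based indices are our assumptions.

```python
from itertools import combinations

def permutation_sign(p):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def wedge(omega_k, k, omega_l, l):
    """Exterior product of a k-form and an l-form, as a sum over partitions."""
    def product(*vectors):
        assert len(vectors) == k + l
        indices = range(k + l)
        total = 0
        for first in combinations(indices, k):          # i_1 < ... < i_k
            second = tuple(i for i in indices if i not in first)   # j_1 < ... < j_l
            sign = permutation_sign(first + second)
            total += (sign
                      * omega_k(*[vectors[i] for i in first])
                      * omega_l(*[vectors[j] for j in second]))
        return total
    return product

x1, x2, x3 = (lambda xi, i=i: xi[i] for i in range(3))   # coordinate 1-forms on R^3
x2_x3 = wedge(x2, 1, x3, 1)                               # a 2-form
volume = wedge(x1, 1, x2_x3, 2)                           # the 3-form x1 ^ (x2 ^ x3)
e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
```

For k = l = 1 the sum has exactly the two terms of the example above, and `volume(e1, e2, e3)` evaluates to 1.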


PROBLEM 1. Show that the definition above actually defines a (k + l)-form (i.e., that the value of
$(\omega^k \wedge \omega^l)(\xi_1, \ldots, \xi_{k+l})$ depends linearly and skew-symmetrically on the vectors $\xi$).


B Properties of the exterior product 


Theorem. The exterior multiplication of forms defined above is skew-com- 
mutative, distributive, and associative. For monomials it coincides with the 
multiplication defined in Section 32. 


The proof of skew-commutativity is based on the simplest properties of 
even and odd permutations (cf. the problem at the end of Section 32) and will 
be left to the reader. 

Distributivity follows from the fact that every term in (1) is linear with
respect to $\omega^k$ and $\omega^l$.

The proof of associativity requires a little more combinatorics. Since the
corresponding arguments are customarily carried out in algebra courses for
the proof of Laplace’s theorem on the expansion of a determinant by column
minors, we may use this theorem.⁵³

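We begin with the following observation: if associativity is proved for the
terms of a sum, then it is also true for the sum, i.e.,

\[ (\omega_1 \wedge \omega_2) \wedge \omega_3 = \omega_1 \wedge (\omega_2 \wedge \omega_3) \quad \text{and} \quad (\omega_1' \wedge \omega_2) \wedge \omega_3 = \omega_1' \wedge (\omega_2 \wedge \omega_3) \]

imply

\[ ((\omega_1 + \omega_1') \wedge \omega_2) \wedge \omega_3 = (\omega_1 + \omega_1') \wedge (\omega_2 \wedge \omega_3). \]

For, by distributivity, which has already been proved, we have

\[ ((\omega_1 + \omega_1') \wedge \omega_2) \wedge \omega_3 = ((\omega_1 \wedge \omega_2) \wedge \omega_3) + ((\omega_1' \wedge \omega_2) \wedge \omega_3), \]

\[ (\omega_1 + \omega_1') \wedge (\omega_2 \wedge \omega_3) = (\omega_1 \wedge (\omega_2 \wedge \omega_3)) + (\omega_1' \wedge (\omega_2 \wedge \omega_3)). \]

We already know from Section 32 (Problem 13) that every form on $\mathbb{R}^n$ is a
sum of monomials; therefore, it is enough to show associativity for multiplication
of monomials.

Since we have not yet proved the equivalence of the definition in Section
32 of multiplication of k 1-forms with the general definition (1), we will
temporarily denote the multiplication of k 1-forms by the symbol $\bar\wedge$, so that
our monomials have the form

\[ \omega^k = \omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k \quad \text{and} \quad \omega^l = \omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l}, \]

where $\omega_1, \ldots, \omega_{k+l}$ are 1-forms.


53 A direct proof of associativity (also containing a proof of Laplace’s theorem) consists of
checking the signs in the identity

\[ ((\omega^k \wedge \omega^l) \wedge \omega^m)(\xi_1, \ldots, \xi_{k+l+m}) = \sum \pm\,\omega^k(\xi_{i_1}, \ldots, \xi_{i_k})\,\omega^l(\xi_{j_1}, \ldots, \xi_{j_l})\,\omega^m(\xi_{h_1}, \ldots, \xi_{h_m}), \]

where $i_1 < \cdots < i_k$, $j_1 < \cdots < j_l$, $h_1 < \cdots < h_m$; $(i_1, \ldots, h_m)$ is a permutation of the numbers
$(1, \ldots, k + l + m)$.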


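Lemma. The exterior product of two monomials is a monomial:

\[ (\omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k) \wedge (\omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l}) = \omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k \,\bar\wedge\, \omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l}. \]


PROOF. We calculate the values of the left and right sides on k + l vectors
$\xi_1, \ldots, \xi_{k+l}$. The value of the left side, by formula (1), is equal to the sum of
the products

\[ \sum \pm \det_{1 \le i \le k} \left|\omega_i(\xi_{i_m})\right| \cdot \det_{k < i \le k+l} \left|\omega_i(\xi_{j_m})\right| \]

of the minors of the first k columns of the determinant of order k + l and the
remaining minors. Laplace’s theorem on the expansion by minors of the
first k columns asserts exactly that this sum, with the same rule of sign choice
as in Definition (1), is equal to the determinant $\det|\omega_i(\xi_j)|$. $\square$


It follows from the lemma that the operations $\bar\wedge$ and $\wedge$ coincide: we get,
in turn,

\[ \omega_1 \,\bar\wedge\, \omega_2 = \omega_1 \wedge \omega_2, \]

\[ \omega_1 \,\bar\wedge\, \omega_2 \,\bar\wedge\, \omega_3 = (\omega_1 \,\bar\wedge\, \omega_2) \wedge \omega_3 = (\omega_1 \wedge \omega_2) \wedge \omega_3, \]

\[ \omega_1 \,\bar\wedge\, \omega_2 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k = (\cdots((\omega_1 \wedge \omega_2) \wedge \omega_3) \wedge \cdots \wedge \omega_k). \]

The associativity of $\wedge$-multiplication of monomials therefore follows from
the obvious associativity of $\bar\wedge$-multiplication of 1-forms. Thus, in view of the
observation made above, associativity is proved in the general case.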


PROBLEM 2. Show that the exterior square of a 1-form, or, in general, of a form of odd order, is
equal to zero: $\omega^k \wedge \omega^k = 0$ if k is odd.


EXAMPLE 1. Consider a coordinate system $p_1, \ldots, p_n, q_1, \ldots, q_n$ on $\mathbb{R}^{2n}$ and the 2-form

\[ \omega^2 = \sum_{i=1}^{n} p_i \wedge q_i. \]

[Geometrically, this form signifies the sum of the oriented areas of the projection of a parallelogram
on the n two-dimensional coordinate planes $(p_1, q_1), \ldots, (p_n, q_n)$. Later, we will see
that the 2-form $\omega^2$ has a special meaning for hamiltonian mechanics. It can be shown that every
nondegenerate⁵⁴ 2-form on $\mathbb{R}^{2n}$ has the form $\omega^2$ in some coordinate system $(p_1, \ldots, q_n)$.]


PROBLEM 3. Find the exterior square of the 2-form $\omega^2$.

ANSWER.

\[ \omega^2 \wedge \omega^2 = -2 \sum_{i>j} p_i \wedge p_j \wedge q_i \wedge q_j. \]

PROBLEM 4. Find the exterior k-th power of $\omega^2$.

ANSWER.

\[ \underbrace{\omega^2 \wedge \omega^2 \wedge \cdots \wedge \omega^2}_{k} = \pm k! \sum_{i_1 < \cdots < i_k} p_{i_1} \wedge \cdots \wedge p_{i_k} \wedge q_{i_1} \wedge \cdots \wedge q_{i_k}. \]

In particular,

\[ \underbrace{\omega^2 \wedge \cdots \wedge \omega^2}_{n} = \pm n!\; p_1 \wedge \cdots \wedge p_n \wedge q_1 \wedge \cdots \wedge q_n \]

is, up to a factor, the volume of a 2n-dimensional parallelepiped in $\mathbb{R}^{2n}$.


54 A bilinear form $\omega^2$ is nondegenerate if $\forall \xi \ne 0$, $\exists \eta$: $\omega^2(\xi, \eta) \ne 0$. See Section 41B.
EXAMPLE 2. Consider the oriented euclidean space $\mathbb{R}^3$. Every vector $\mathbf{A} \in \mathbb{R}^3$ determines a 1-form
$\omega_A^1$ by $\omega_A^1(\xi) = (\mathbf{A}, \xi)$ (scalar product) and a 2-form $\omega_A^2$ by

\[ \omega_A^2(\xi_1, \xi_2) = (\mathbf{A}, \xi_1, \xi_2) \quad \text{(triple scalar product).} \]


PROBLEM 5. Show that the maps $\mathbf{A} \to \omega_A^1$ and $\mathbf{A} \to \omega_A^2$ establish isomorphisms of the linear space
$\mathbb{R}^3$ of vectors $\mathbf{A}$ with the linear spaces of 1-forms on $\mathbb{R}^3$ and 2-forms on $\mathbb{R}^3$. If we choose an
orthonormal oriented coordinate system $(x_1, x_2, x_3)$ on $\mathbb{R}^3$, then

\[ \omega_A^1 = A_1x_1 + A_2x_2 + A_3x_3 \]

and

\[ \omega_A^2 = A_1\,x_2 \wedge x_3 + A_2\,x_3 \wedge x_1 + A_3\,x_1 \wedge x_2. \]

Remark. Thus the isomorphisms do not depend on the choice of the orthonormal oriented
coordinate system $(x_1, x_2, x_3)$. But they do depend on the choice of the euclidean structure
on $\mathbb{R}^3$, and the isomorphism $\mathbf{A} \to \omega_A^2$ also depends on the orientation (coming implicitly in the
definition of triple scalar product).


PROBLEM 6. Show that, under the isomorphisms established above, the exterior product of
1-forms becomes the vector product in $\mathbb{R}^3$, i.e., that
$$\omega^1_A \wedge \omega^1_B = \omega^2_{[A, B]} \quad \text{for any } A, B \in \mathbb{R}^3.$$
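
A numerical spot check of the statement of Problem 6 is easy to write down. The sketch below is an illustration only (it checks one choice of vectors and proves nothing); the function names are hypothetical, and the exterior product of two 1-forms is taken to be $(\omega_1 \wedge \omega_2)(\xi, \eta) = \omega_1(\xi)\omega_2(\eta) - \omega_1(\eta)\omega_2(\xi)$.

```python
def dot(a, b):
    """Scalar product (a, b)."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product [a, b] in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triple(a, b, c):
    """Triple scalar product (a, b, c) = (a, [b, c])."""
    return dot(a, cross(b, c))


def one_form(A):
    """The 1-form xi -> (A, xi)."""
    return lambda xi: dot(A, xi)


def two_form(A):
    """The 2-form (xi, eta) -> (A, xi, eta)."""
    return lambda xi, eta: triple(A, xi, eta)


def wedge_1_1(w1, w2):
    """Exterior product of two 1-forms."""
    return lambda xi, eta: w1(xi) * w2(eta) - w1(eta) * w2(xi)


A, B = (1, 2, 3), (-2, 0, 5)
xi, eta = (4, -1, 2), (0, 3, 7)

lhs = wedge_1_1(one_form(A), one_form(B))(xi, eta)   # (w_A ^ w_B)(xi, eta)
rhs = two_form(cross(A, B))(xi, eta)                 # ([A, B], xi, eta)
```

Both sides evaluate to the same integer for these vectors, in agreement with the identity $(A, \xi)(B, \eta) - (A, \eta)(B, \xi) = ([A, B], [\xi, \eta])$.
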


In this way the exterior product of 1-forms can be considered as an extension of the vector
product in $\mathbb{R}^3$ to higher dimensions. However, in the $n$-dimensional case, the product is not a
vector in the same space: the space of 2-forms on $\mathbb{R}^n$ is isomorphic to $\mathbb{R}^n$ only for $n = 3$.


PROBLEM 7. Show that, under the isomorphisms established above, the exterior product of a
1-form and a 2-form becomes the scalar product of vectors in $\mathbb{R}^3$:


$$\omega^1_A \wedge \omega^2_B = (A, B)\, x_1 \wedge x_2 \wedge x_3.$$


C Behavior under mappings 


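Let $f: \mathbb{R}^m \to \mathbb{R}^n$ be a linear map, and $\omega^k$ an exterior $k$-form on $\mathbb{R}^n$. Then
there is a $k$-form $f^*\omega^k$ on $\mathbb{R}^m$, whose value on the $k$ vectors $\xi_1, \ldots, \xi_k \in \mathbb{R}^m$
is equal to the value of $\omega^k$ on their images:


$$(f^*\omega^k)(\xi_1, \ldots, \xi_k) = \omega^k(f\xi_1, \ldots, f\xi_k).$$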


PROBLEM 8. Verify that $f^*\omega^k$ is an exterior form.


PROBLEM 9. Verify that $f^*$ is a linear operator from the space of $k$-forms on $\mathbb{R}^n$ to the space of
$k$-forms on $\mathbb{R}^m$ (the star superscript means that $f^*$ acts in the opposite direction from $f$).


PROBLEM 10. Let $f: \mathbb{R}^m \to \mathbb{R}^n$ and $g: \mathbb{R}^n \to \mathbb{R}^p$. Verify that $(g \circ f)^* = f^* \circ g^*$.


PROBLEM 11. Verify that $f^*$ preserves exterior multiplication: $f^*(\omega^k \wedge \omega^l) = (f^*\omega^k) \wedge (f^*\omega^l)$.
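
The definition of $f^*$ translates almost word for word into code. The following sketch is an illustration under stated assumptions: linear maps are given by matrices (lists of rows), forms are plain Python functions of their vector arguments, and the names `apply`, `pullback`, `compose`, and `area` are hypothetical. It spot-checks the rule of Problem 10 on one pair of vectors.

```python
def apply(matrix, v):
    """Apply a linear map given as a list of rows to the vector v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in matrix)


def pullback(matrix, form):
    """Return f*form, where (f*form)(xi_1, ..., xi_k) = form(f xi_1, ..., f xi_k)."""
    return lambda *vectors: form(*(apply(matrix, v) for v in vectors))


def compose(g, f):
    """Matrix of the composition g o f."""
    cols = list(zip(*f))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in g]


def area(xi, eta):
    """The 2-form x_1 ^ x_2 on R^2 (oriented area)."""
    return xi[0] * eta[1] - xi[1] * eta[0]


f = [[1, 2, 0], [0, 1, -1]]      # f: R^3 -> R^2
g = [[2, 1], [0, 3]]             # g: R^2 -> R^2, det g = 6
xi, eta = (1, 0, 2), (0, 1, 1)

via_composition = pullback(compose(g, f), area)(xi, eta)   # (g o f)* area
via_pullbacks = pullback(f, pullback(g, area))(xi, eta)    # f*(g* area)
scaled_by_det = 6 * pullback(f, area)(xi, eta)             # det(g) * f* area
```

The third quantity uses the standard fact that a linear map of the plane multiplies oriented areas by its determinant, so that $g^*(x_1 \wedge x_2) = \det g \cdot x_1 \wedge x_2$.
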
34 Differential forms


We give here the definition of differential forms on differentiable manifolds. 


A Differential 1-forms 


The simplest example of a differential form is the differential of a function. 


EXAMPLE. Consider the function $y = f(x) = x^2$. Its differential $df = 2x\,dx$ depends on the
point $x$ and on the “increment of the argument,” i.e., on the tangent vector $\xi$ to the $x$ axis. We
fix the point $x$. Then the differential of the function at $x$, $df|_x$, depends linearly on $\xi$. So, if $x = 1$
and the coordinate of the tangent vector $\xi$ is equal to 1, then $df = 2$, and if the coordinate of
$\xi$ is equal to 10, then $df = 20$ (Figure 140).


Figure 140 Differential of a function


Let $f: M \to \mathbb{R}$ be a differentiable function on the manifold $M$ (we can
imagine a “function of many variables” $f: \mathbb{R}^n \to \mathbb{R}$). The differential $df|_x$
of $f$ at $x$ is a linear map


$$df_x: TM_x \to \mathbb{R}$$


of the tangent space to $M$ at $x$ into the real line. We recall from Section 18F the
definition of this map:

Let $\xi \in TM_x$ be the velocity vector of the curve $x(t): \mathbb{R} \to M$; $x(0) = x$
and $\dot{x}(0) = \xi$. Then, by definition,


$$df_x(\xi) = \frac{d}{dt}\bigg|_{t=0} f(x(t)).$$


PROBLEM 1. Let $\xi$ be the velocity vector of the plane curve $x(t) = \cos t$, $y(t) = \sin t$ at $t = 0$.
Calculate the values of the differentials $dx$ and $dy$ of the functions $x$ and $y$ on the vector $\xi$
(Figure 141).

ANSWER. $dx|_{(1,0)}(\xi) = 0$, $dy|_{(1,0)}(\xi) = 1$.
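
The definition of $df(\xi)$ as the derivative of $f$ along a curve with velocity $\xi$ can be tried out numerically. The sketch below is an illustration only: it replaces the derivative by a central difference with a hypothetical step `h`, and the function name `differential` is not from the text. It reproduces the values of Problem 1 and of the example $f(x) = x^2$ at $x = 1$.

```python
import math


def differential(f, curve, t0=0.0, h=1e-6):
    """Approximate df(xi) = d/dt f(x(t)) at t = t0, where xi is the velocity of the curve."""
    return (f(curve(t0 + h)) - f(curve(t0 - h))) / (2 * h)


def circle(t):
    """The plane curve x(t) = cos t, y(t) = sin t."""
    return (math.cos(t), math.sin(t))


dx_value = differential(lambda p: p[0], circle)   # value of dx on xi
dy_value = differential(lambda p: p[1], circle)   # value of dy on xi

# The one-variable example: f(x) = x**2 at x = 1, tangent vectors 1 and 10.
df_1 = differential(lambda p: p[0] ** 2, lambda t: (1 + t,))
df_10 = differential(lambda p: p[0] ** 2, lambda t: (1 + 10 * t,))
```

The curves `t -> (1 + t,)` and `t -> (1 + 10 * t,)` are one convenient choice of curves through $x = 1$ with velocities 1 and 10; any other curve with the same velocity would give the same limit.
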


Note that the differential of a function $f$ at a point $x \in M$ is a 1-form $df_x$ on
the tangent space $TM_x$.
Figure 141 Problem 1


The differential $df$ of $f$ on the manifold $M$ is a smooth map of the tangent
bundle $TM$ to the line


$$df: TM \to \mathbb{R} \qquad \Big(TM = \bigcup_x TM_x\Big).$$
This map is differentiable and is linear on each tangent space $TM_x \subset TM$.


Definition. A differential form of degree 1 (or a 1-form) on a manifold $M$ is a
smooth map
$$\omega: TM \to \mathbb{R}$$


of the tangent bundle of $M$ to the line, linear on each tangent space $TM_x$.


One could say that a differential 1-form on $M$ is an algebraic 1-form on
$TM_x$ which is “differentiable with respect to $x$.”


PROBLEM 2. Show that every differential 1-form on the line is the differential of some function. 


PROBLEM 3. Find differential 1-forms on the circle and the plane which are not the differential
of any function.


B The general form of a differential 1-form on $\mathbb{R}^n$


We take as our manifold $M$ a vector space with coordinates $x_1, \ldots, x_n$.
Recall that the components $\xi_1, \ldots, \xi_n$ of a tangent vector $\xi \in T\mathbb{R}^n_x$ are the
values of the differentials $dx_1, \ldots, dx_n$ on the vector $\xi$. These $n$ 1-forms on
$T\mathbb{R}^n_x$ are linearly independent. Thus the 1-forms $dx_1, \ldots, dx_n$ form a basis for
the $n$-dimensional space of 1-forms on $T\mathbb{R}^n_x$, and every 1-form on $T\mathbb{R}^n_x$ can
be uniquely written in the form $a_1\,dx_1 + \cdots + a_n\,dx_n$, where the $a_i$ are real
coefficients. Now let $\omega$ be an arbitrary differential 1-form on $\mathbb{R}^n$. At every
point $x$ it can be expanded uniquely in the basis $dx_1, \ldots, dx_n$. From this we get:


Theorem. Every differential 1-form on the space $\mathbb{R}^n$ with a given coordinate
system $x_1, \ldots, x_n$ can be written uniquely in the form
$$\omega = a_1(x)\,dx_1 + \cdots + a_n(x)\,dx_n,$$


where the coefficients $a_i(x)$ are smooth functions.
Figure 142 Problem 4


PROBLEM 4. Calculate the value of the forms $\omega_1 = dx_1$, $\omega_2 = x_1\,dx_2$, and $\omega_3 = dr^2$ ($r^2 = x_1^2 + x_2^2$)
on the vectors $\xi_1$, $\xi_2$, and $\xi_3$ (Figure 142).


ANSWER. 


PROBLEM 5. Let $x_1, \ldots, x_n$ be functions on a manifold $M$ forming a local coordinate system in
some region. Show that every 1-form on this region can be uniquely written in the form
$\omega = a_1(x)\,dx_1 + \cdots + a_n(x)\,dx_n$.


C Differential k-forms 


Definition. A differential $k$-form $\omega^k|_x$ at a point $x$ of a manifold $M$ is an exterior
$k$-form on the tangent space $TM_x$ to $M$ at $x$, i.e., a $k$-linear skew-symmetric
function of $k$ vectors $\xi_1, \ldots, \xi_k$ tangent to $M$ at $x$.


If such a form $\omega^k|_x$ is given at every point $x$ of the manifold $M$ and if it is
differentiable, then we say that we are given a $k$-form $\omega^k$ on the manifold $M$.


PROBLEM 6. Put a natural differentiable manifold structure on the set whose elements are $k$-tuples
of vectors tangent to $M$ at some point $x$.


A differential k-form is a smooth map from the manifold of Problem 6 to 
the line. 


PROBLEM 7. Show that the k-forms on M form a vector space (infinite-dimensional if k does not 
exceed the dimension of M). 


Differential forms can be multiplied by functions as well as by numbers.
Therefore, the set of $C^\infty$ differential $k$-forms has a natural structure as a
module over the ring of infinitely differentiable real functions on $M$.
D The general form of a differential $k$-form on $\mathbb{R}^n$


Take as the manifold $M$ the vector space $\mathbb{R}^n$ with fixed coordinate functions
$x_1, \ldots, x_n: \mathbb{R}^n \to \mathbb{R}$. Fix a point $x$. We saw above that the $n$ 1-forms $dx_1, \ldots, dx_n$
form a basis of the space of 1-forms on the tangent space $T\mathbb{R}^n_x$.
Consider exterior products of the basic forms:


$$dx_{i_1} \wedge \cdots \wedge dx_{i_k}, \qquad i_1 < \cdots < i_k.$$


In Section 32 we saw that these $\binom{n}{k}$ $k$-forms form a basis of the space of exterior
$k$-forms on $T\mathbb{R}^n_x$. Therefore, every exterior $k$-form on $T\mathbb{R}^n_x$ can be written
uniquely in the form


$$\sum_{i_1 < \cdots < i_k} a_{i_1, \ldots, i_k}\, dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$$


Now let $\omega$ be an arbitrary differential $k$-form on $\mathbb{R}^n$. At every point $x$ it
can be uniquely expressed in terms of the basis above. From this follows:


Theorem. Every differential $k$-form on the space $\mathbb{R}^n$ with a given coordinate
system $x_1, \ldots, x_n$ can be written uniquely in the form


$$\omega^k = \sum_{i_1 < \cdots < i_k} a_{i_1, \ldots, i_k}(x)\, dx_{i_1} \wedge \cdots \wedge dx_{i_k},$$


where the $a_{i_1, \ldots, i_k}(x)$ are smooth functions on $\mathbb{R}^n$.


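PROBLEM 8. Calculate the value of the forms $\omega_1 = dx_1 \wedge dx_2$, $\omega_2 = x_1\,dx_1 \wedge dx_2 - x_2\,dx_2 \wedge dx_1$,
and $\omega_3 = r\,dr \wedge d\varphi$ (where $x_1 = r\cos\varphi$ and $x_2 = r\sin\varphi$) on the pairs of vectors $(\xi_1, \eta_1)$,
$(\xi_2, \eta_2)$, and $(\xi_3, \eta_3)$ (Figure 143).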


ANSWER. 


(6i1.m) (2,02) (3, m3) 


o,| 1 1 = 
w,; 2 1 = 
o;; 1 1 Af 


Figure 143 Problem 8 
PROBLEM 9. Calculate the value of the forms $\omega_1 = dx_2 \wedge dx_3$, $\omega_2 = x_1\,dx_3 \wedge dx_2$, and
$\omega_3 = dx_3 \wedge dr^2$ ($r^2 = x_1^2 + x_2^2 + x_3^2$), on the pair of vectors $\xi = (1, 1, 1)$, $\eta = (1, 2, 3)$ at the
point $x = (2, 0, 0)$.


ANSWER. $\omega_1 = 1$, $\omega_2 = -2$, $\omega_3 = -8$.
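
The answer to Problem 9 can be reproduced by direct evaluation. The sketch below is an illustration, taking the three forms to be $\omega_1 = dx_2 \wedge dx_3$, $\omega_2 = x_1\,dx_3 \wedge dx_2$, and $\omega_3 = dx_3 \wedge dr^2$ (the reading of the subscripts that reproduces the stated answer). A 1-form at the point $x$ is represented by its coefficient vector, and the helper name `wedge_of_one_forms` is hypothetical.

```python
def wedge_of_one_forms(alpha, beta):
    """Exterior product of two 1-forms given by their coefficient vectors."""
    def two_form(xi, eta):
        a_xi = sum(a * c for a, c in zip(alpha, xi))
        a_eta = sum(a * c for a, c in zip(alpha, eta))
        b_xi = sum(b * c for b, c in zip(beta, xi))
        b_eta = sum(b * c for b, c in zip(beta, eta))
        return a_xi * b_eta - a_eta * b_xi
    return two_form


x = (2, 0, 0)
xi, eta = (1, 1, 1), (1, 2, 3)

dx1, dx2, dx3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
dr2 = tuple(2 * c for c in x)   # d(r^2) = 2 x1 dx1 + 2 x2 dx2 + 2 x3 dx3 at x

omega1 = wedge_of_one_forms(dx2, dx3)(xi, eta)
omega2 = x[0] * wedge_of_one_forms(dx3, dx2)(xi, eta)
omega3 = wedge_of_one_forms(dx3, dr2)(xi, eta)
```
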


PROBLEM 10. Let $x_1, \ldots, x_n: M \to \mathbb{R}$ be functions on a manifold which form a local coordinate
system on some region. Show that every differential form on this region can be written uniquely in
the form


$$\omega^k = \sum_{i_1 < \cdots < i_k} a_{i_1, \ldots, i_k}(x)\, dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$$


EXAMPLE. Change of variables in a form. Suppose that we are given two
coordinate systems on $\mathbb{R}^3$: $x_1, x_2, x_3$ and $y_1, y_2, y_3$. Let $\omega$ be a 2-form on $\mathbb{R}^3$.
Then, by the theorem above, $\omega$ can be written in the system of $x$-coordinates
as $\omega = X_1\,dx_2 \wedge dx_3 + X_2\,dx_3 \wedge dx_1 + X_3\,dx_1 \wedge dx_2$, where $X_1$, $X_2$,
and $X_3$ are functions of $x_1$, $x_2$, and $x_3$, and in the system of $y$-coordinates as
$\omega = Y_1\,dy_2 \wedge dy_3 + Y_2\,dy_3 \wedge dy_1 + Y_3\,dy_1 \wedge dy_2$, where $Y_1$, $Y_2$, and $Y_3$
are functions of $y_1$, $y_2$, and $y_3$.


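PROBLEM 11. Given the form written in the $x$-coordinates (i.e., the $X_i$) and the change of variables
formulas $x = x(y)$, write the form in $y$-coordinates, i.e., find $Y$.
Solution. We have $dx_i = (\partial x_i/\partial y_1)\,dy_1 + (\partial x_i/\partial y_2)\,dy_2 + (\partial x_i/\partial y_3)\,dy_3$. Therefore,


$$dx_2 \wedge dx_3 = \left(\frac{\partial x_2}{\partial y_1}\,dy_1 + \frac{\partial x_2}{\partial y_2}\,dy_2 + \frac{\partial x_2}{\partial y_3}\,dy_3\right) \wedge \left(\frac{\partial x_3}{\partial y_1}\,dy_1 + \frac{\partial x_3}{\partial y_2}\,dy_2 + \frac{\partial x_3}{\partial y_3}\,dy_3\right),$$


from which we get


$$Y_3 = X_1 \frac{D(x_2, x_3)}{D(y_1, y_2)} + X_2 \frac{D(x_3, x_1)}{D(y_1, y_2)} + X_3 \frac{D(x_1, x_2)}{D(y_1, y_2)}, \quad \text{etc.}$$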

E Appendix. Differential forms in three-dimensional spaces 


Let $M$ be a three-dimensional oriented riemannian manifold (in all future
examples $M$ will be euclidean three-space $\mathbb{R}^3$). Let $x_1$, $x_2$, and $x_3$ be local
coordinates, and let the square of the length element have the form


$$ds^2 = E_1\,dx_1^2 + E_2\,dx_2^2 + E_3\,dx_3^2$$


(i.e., the coordinate system is triply orthogonal).


PROBLEM 12. Find $E_1$, $E_2$, and $E_3$ for cartesian coordinates $x, y, z$, for cylindrical coordinates
$r, \varphi, z$ and for spherical coordinates $R, \varphi, \theta$ in the euclidean space $\mathbb{R}^3$ (Figure 144).


ANSWER.


$$ds^2 = dx^2 + dy^2 + dz^2 = dr^2 + r^2\,d\varphi^2 + dz^2 = dR^2 + R^2\cos^2\theta\,d\varphi^2 + R^2\,d\theta^2.$$


We let $e_1$, $e_2$, and $e_3$ denote the unit vectors in the coordinate directions.
These three vectors form a basis of the tangent space.
Figure 144 Problem 12


PROBLEM 13. Find the values of the forms $dx_1$, $dx_2$, and $dx_3$ on the vectors $e_1$, $e_2$, and $e_3$.


ANSWER. $dx_i(e_i) = 1/\sqrt{E_i}$, the rest are zero. In particular, for cartesian coordinates $dx(e_x) =
dy(e_y) = dz(e_z) = 1$; for cylindrical coordinates $dr(e_r) = dz(e_z) = 1$ and $d\varphi(e_\varphi) = 1/r$ (Figure
145); for spherical coordinates $dR(e_R) = 1$, $d\varphi(e_\varphi) = 1/(R\cos\theta)$ and $d\theta(e_\theta) = 1/R$.


The metric and orientation on the manifold M furnish the tangent space 
to M at every point with the structure of an oriented euclidean three-dimen- 
sional space. In terms of this structure, we can talk about scalar, vector, and 
triple scalar products. 


PROBLEM 14. Calculate $[e_1, e_2]$, $(e_R, e_\theta)$, and $(e_z, e_x, e_y)$.


ANSWER. $e_3$, 0, 1.


In an oriented euclidean three-space every vector $A$ corresponds to a
1-form $\omega^1_A$ and a 2-form $\omega^2_A$, defined by the conditions


$$\omega^1_A(\xi) = (A, \xi), \qquad \omega^2_A(\xi, \eta) = (A, \xi, \eta), \qquad \xi, \eta \in \mathbb{R}^3.$$


The correspondence between vector fields and forms does not depend on
the system of coordinates, but only on the euclidean structure and orientation.
Therefore, every vector field $A$ on our manifold $M$ corresponds to a
differential 1-form $\omega^1_A$ on $M$ and a differential 2-form $\omega^2_A$ on $M$.


Figure 145 Problem 13 
The formulas for changing from fields to forms and back have a different
form in each coordinate system. Suppose that in the coordinates $x_1$, $x_2$, and
$x_3$ described above, the vector field has the form


$$A = A_1 e_1 + A_2 e_2 + A_3 e_3$$


(the components $A_i$ are smooth functions on $M$). The corresponding 1-form
$\omega^1_A$ decomposes over the basis $dx_i$, and the corresponding 2-form over the
basis $dx_i \wedge dx_j$.


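PROBLEM 15. Given the components of the vector field $A$, find the decompositions of the 1-form
$\omega^1_A$ and the 2-form $\omega^2_A$.

Solution. We have $\omega^1_A(e_1) = (A, e_1) = A_1$. Also, $(a_1\,dx_1 + a_2\,dx_2 + a_3\,dx_3)(e_1) =
a_1\,dx_1(e_1) = a_1/\sqrt{E_1}$. From this we get that $a_1 = A_1\sqrt{E_1}$, so that


$$\omega^1_A = A_1\sqrt{E_1}\,dx_1 + A_2\sqrt{E_2}\,dx_2 + A_3\sqrt{E_3}\,dx_3.$$


In the same way, we have $\omega^2_A(e_2, e_3) = (A, e_2, e_3) = A_1$. Also,


$$(\alpha_1\,dx_2 \wedge dx_3 + \alpha_2\,dx_3 \wedge dx_1 + \alpha_3\,dx_1 \wedge dx_2)(e_2, e_3) = \alpha_1 \frac{1}{\sqrt{E_2 E_3}}.$$


Hence, $\alpha_1 = A_1\sqrt{E_2 E_3}$, i.e.,
$$\omega^2_A = A_1\sqrt{E_2 E_3}\,dx_2 \wedge dx_3 + A_2\sqrt{E_3 E_1}\,dx_3 \wedge dx_1 + A_3\sqrt{E_1 E_2}\,dx_1 \wedge dx_2.$$
In particular, in cartesian, cylindrical, and spherical coordinates on $\mathbb{R}^3$ the vector field
$$A = A_x e_x + A_y e_y + A_z e_z = A_r e_r + A_\varphi e_\varphi + A_z e_z = A_R e_R + A_\varphi e_\varphi + A_\theta e_\theta$$

corresponds to the 1-form
$$\omega^1_A = A_x\,dx + A_y\,dy + A_z\,dz = A_r\,dr + r A_\varphi\,d\varphi + A_z\,dz = A_R\,dR + R\cos\theta\,A_\varphi\,d\varphi + R A_\theta\,d\theta$$
and the 2-form


$$\omega^2_A = A_x\,dy \wedge dz + A_y\,dz \wedge dx + A_z\,dx \wedge dy$$
$$= r A_r\,d\varphi \wedge dz + A_\varphi\,dz \wedge dr + r A_z\,dr \wedge d\varphi$$
$$= R^2\cos\theta\,A_R\,d\varphi \wedge d\theta + R A_\varphi\,d\theta \wedge dR + R\cos\theta\,A_\theta\,dR \wedge d\varphi.$$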


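An example of a vector field on a manifold $M$ is the gradient of a function
$f: M \to \mathbb{R}$. Recall that the gradient of a function is the vector field $\operatorname{grad} f$
corresponding to the differential:


$$\omega^1_{\operatorname{grad} f} = df, \quad \text{i.e.,} \quad df(\xi) = (\operatorname{grad} f, \xi) \quad \forall \xi.$$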


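PROBLEM 16. Find the components of the gradient of a function in the basis $e_1, e_2, e_3$.
Solution. We have $df = (\partial f/\partial x_1)\,dx_1 + (\partial f/\partial x_2)\,dx_2 + (\partial f/\partial x_3)\,dx_3$. By the problem above


$$\operatorname{grad} f = \frac{1}{\sqrt{E_1}} \frac{\partial f}{\partial x_1} e_1 + \frac{1}{\sqrt{E_2}} \frac{\partial f}{\partial x_2} e_2 + \frac{1}{\sqrt{E_3}} \frac{\partial f}{\partial x_3} e_3.$$
In particular, in cartesian, cylindrical, and spherical coordinates


$$\operatorname{grad} f = \frac{\partial f}{\partial x} e_x + \frac{\partial f}{\partial y} e_y + \frac{\partial f}{\partial z} e_z$$
$$= \frac{\partial f}{\partial r} e_r + \frac{1}{r} \frac{\partial f}{\partial \varphi} e_\varphi + \frac{\partial f}{\partial z} e_z$$
$$= \frac{\partial f}{\partial R} e_R + \frac{1}{R\cos\theta} \frac{\partial f}{\partial \varphi} e_\varphi + \frac{1}{R} \frac{\partial f}{\partial \theta} e_\theta.$$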

35 Integration of differential forms 


We define here the concepts of a chain, the boundary of a chain, and the integration of a form 
over a chain. 

The integral of a differential form is a higher-dimensional generalization of such ideas as the 
flux of a fluid across a surface or the work of a force along a path. 


A The integral of a 1-form along a path 
We begin by integrating a 1-form $\omega^1$ on a manifold $M$. Let
$$\gamma: [0 \le t \le 1] \to M$$
be a smooth map (the “path of integration”). The integral of the form
$\omega^1$ on the path $\gamma$ is defined as a limit of Riemann sums. Every Riemann sum
consists of the values of the form $\omega^1$ on some tangent vectors $\xi_i$ (Figure 146):


$$\int_\gamma \omega^1 = \lim_{\Delta \to 0} \sum_{i=1}^{n} \omega^1(\xi_i).$$
The tangent vectors $\xi_i$ are constructed in the following way. The interval
$0 \le t \le 1$ is divided into parts $\Delta_i: t_i \le t \le t_{i+1}$ by the points $t_i$. The interval
$\Delta_i$ can be looked at as a tangent vector $\Delta_i$ to the $t$ axis at the point $t_i$. Its
image in the tangent space to $M$ at the point $\gamma(t_i)$ is


$$\xi_i = d\gamma|_{t_i}(\Delta_i) \in TM_{\gamma(t_i)}.$$
The sum has a limit as the largest of the intervals $\Delta_i$ tends to zero. It is
called the integral of the 1-form $\omega^1$ along the path $\gamma$.


Figure 146 Integrating a 1-form along a path


The definition of the integral of a $k$-form along a $k$-dimensional surface
follows an analogous pattern. The surface of integration is partitioned into
small curvilinear $k$-dimensional parallelepipeds (Figure 147); these parallelepipeds
are replaced by parallelepipeds in the tangent space. The sum of the
values of the form on the parallelepipeds in the tangent space approaches
the integral as the partition is refined. We will first consider a particular case.


Figure 147 Integrating a 2-form over a surface


B The integral of a $k$-form on oriented euclidean space $\mathbb{R}^k$


Let $x_1, \ldots, x_k$ be an oriented coordinate system on $\mathbb{R}^k$. Then every $k$-form
on $\mathbb{R}^k$ is proportional to the form $dx_1 \wedge \cdots \wedge dx_k$, i.e., it has the form
$\omega^k = \varphi(x)\,dx_1 \wedge \cdots \wedge dx_k$, where $\varphi(x)$ is a smooth function.

Let $D$ be a bounded convex polyhedron in $\mathbb{R}^k$ (Figure 148). By definition,
the integral of the form $\omega^k$ on $D$ is the integral of the function $\varphi$:


$$\int_D \omega^k = \int_D \varphi(x)\,dx_1 \cdots dx_k,$$


where the integral on the right is understood to be the usual limit of Riemann
sums.

Such a definition follows the pattern outlined above, since in this case the 
tangent space to the manifold is identified with the manifold. 


PROBLEM 1. Show that $\int_D \omega^k$ depends linearly on $\omega^k$.


PROBLEM 2. Show that if we divide $D$ into two distinct polyhedra $D_1$ and $D_2$, then

$$\int_D \omega^k = \int_{D_1} \omega^k + \int_{D_2} \omega^k.$$

In the general case (a k-form on an n-dimensional space) it is not so easy 
to identify the elements of the partition with tangent parallelepipeds; we will 
consider this case below. 


Figure 148 Integrating a k-form in k-dimensional space 




C The behavior of differential forms under maps 


Let $f: M \to N$ be a differentiable map of a smooth manifold $M$ to a smooth
manifold $N$, and let $\omega$ be a differential $k$-form on $N$ (Figure 149). Then, a
well-defined $k$-form arises also on $M$: it is denoted by $f^*\omega$ and is defined by
the relation


$$(f^*\omega)(\xi_1, \ldots, \xi_k) = \omega(f_*\xi_1, \ldots, f_*\xi_k)$$


for any tangent vectors $\xi_1, \ldots, \xi_k \in TM_x$. Here $f_*$ is the differential of the
map $f$. In other words, the value of the form $f^*\omega$ on the vectors $\xi_1, \ldots, \xi_k$ is
equal to the value of $\omega$ on the images of these vectors.


Figure 149 A form on N induces a form on M.


EXAMPLE. If $y = f(x_1, x_2) = x_1^2 + x_2^2$ and $\omega = dy$, then


$$f^*\omega = 2x_1\,dx_1 + 2x_2\,dx_2.$$


PROBLEM 3. Show that $f^*\omega$ is a $k$-form on $M$.


PROBLEM 4. Show that the map $f^*$ preserves operations on forms:


$$f^*(\lambda_1\omega_1 + \lambda_2\omega_2) = \lambda_1 f^*(\omega_1) + \lambda_2 f^*(\omega_2),$$
$$f^*(\omega_1 \wedge \omega_2) = (f^*\omega_1) \wedge (f^*\omega_2).$$


PROBLEM 5. Let $g: L \to M$ be a differentiable map. Show that $(fg)^* = g^* f^*$.


PROBLEM 6. Let $D_1$ and $D_2$ be two compact, convex polyhedra in the oriented $k$-dimensional
space $\mathbb{R}^k$ and $f: D_1 \to D_2$ a differentiable map which is an orientation-preserving
diffeomorphism⁵⁵ of the interior of $D_1$ onto the interior of $D_2$. Then, for any differential $k$-form $\omega^k$ on $D_2$,


$$\int_{D_1} f^*\omega^k = \int_{D_2} \omega^k.$$


Hint. This is the change of variables theorem for a multiple integral:


$$\int_{D_1} \frac{\partial(y_1, \ldots, y_n)}{\partial(x_1, \ldots, x_n)}\, \varphi(y(x))\,dx_1 \cdots dx_n = \int_{D_2} \varphi(y)\,dy_1 \cdots dy_n.$$


55 i.e., one-to-one with a differentiable inverse.
D Integration of a $k$-form on an $n$-dimensional manifold


Let $\omega$ be a differential $k$-form on an $n$-dimensional manifold $M$. Let $D$ be a
bounded convex $k$-dimensional polyhedron in $k$-dimensional euclidean
space $\mathbb{R}^k$ (Figure 150). The role of “path of integration” will be played by a
$k$-dimensional cell⁵⁶ $\sigma$ of $M$ represented by a triple $\sigma = (D, f, \mathrm{Or})$ consisting
of


1. a convex polyhedron $D \subset \mathbb{R}^k$,
2. a differentiable map $f: D \to M$, and
3. an orientation on $\mathbb{R}^k$, denoted by Or.


Figure 150 Singular k-dimensional polyhedron


Definition. The integral of the $k$-form $\omega$ over the $k$-dimensional cell $\sigma$ is the
integral of the corresponding form over the polyhedron $D$


$$\int_\sigma \omega = \int_D f^*\omega.$$
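
The definition of the integral over a cell as the integral of $f^*\omega$ over $D$ can be imitated numerically. The sketch below is an illustration under several assumptions that are not in the text: the cell is a rectangle $D$ in the $(r, \theta)$ plane mapped onto a piece of the paraboloid $z = x^2 + y^2$, the form is the constant 2-form $dx \wedge dy$ on $\mathbb{R}^3$, the differential of $f$ is approximated by central differences, and all function names are hypothetical.

```python
import math


def surface(r, theta):
    """A hypothetical cell map f: D -> R^3, a piece of the paraboloid z = x^2 + y^2."""
    return (r * math.cos(theta), r * math.sin(theta), r * r)


def dx_wedge_dy(u, v):
    """The 2-form dx ^ dy on R^3 evaluated on tangent vectors u, v."""
    return u[0] * v[1] - u[1] * v[0]


def integrate_over_cell(form, f, r_range, theta_range, n=200, orientation=1, h=1e-6):
    """Midpoint Riemann sum of f*form over the rectangle D = r_range x theta_range.

    The images of the coordinate vectors under the differential of f are
    approximated by central differences; orientation = -1 gives the cell -sigma.
    """
    (r0, r1), (t0, t1) = r_range, theta_range
    dr, dt = (r1 - r0) / n, (t1 - t0) / n
    total = 0.0
    for i in range(n):
        r = r0 + (i + 0.5) * dr
        for j in range(n):
            t = t0 + (j + 0.5) * dt
            f_r = [(p - q) / (2 * h) for p, q in zip(f(r + h, t), f(r - h, t))]
            f_t = [(p - q) / (2 * h) for p, q in zip(f(r, t + h), f(r, t - h))]
            total += form(f_r, f_t) * dr * dt
    return orientation * total


D = ((0.0, 1.0), (0.0, 2 * math.pi))
integral = integrate_over_cell(dx_wedge_dy, surface, *D)
negated = integrate_over_cell(dx_wedge_dy, surface, *D, orientation=-1)
```

Here $f^*(dx \wedge dy) = r\,dr \wedge d\theta$, so the first value approximates $\pi$, the area of the projection of the surface on the $(x, y)$ plane, and reversing the orientation changes its sign.
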


PROBLEM 7. Show that the integral depends linearly on the form:
$$\int_\sigma \lambda_1\omega_1 + \lambda_2\omega_2 = \lambda_1 \int_\sigma \omega_1 + \lambda_2 \int_\sigma \omega_2.$$


The $k$-dimensional cell which differs from $\sigma$ only by the choice of orientation
is called the negative of $\sigma$ and is denoted by $-\sigma$ or $-1 \cdot \sigma$ (Figure 151).


Figure 151 Problem 8


PROBLEM 8. Show that, under a change of orientation, the integral changes sign:
$$\int_{-\sigma} \omega = -\int_\sigma \omega.$$


56 The cell $\sigma$ is usually called a singular $k$-dimensional polyhedron.
E Chains


The set f(D) is not necessarily a smooth submanifold of M. It could have 
“self-intersections” or “folds” and could even be reduced to a point. How- 
ever, even in the one-dimensional case, it is clear that it is inconvenient to 
restrict ourselves to contours of integration consisting of one piece: it is 
useful to be able to consider contours consisting of several pieces which can 
be traversed in either direction, perhaps more than once. The analogous 
concept in higher dimensions is called a chain. 


Definition. A chain of dimension $k$ on a manifold $M$ consists of a finite collection
of $k$-dimensional oriented cells $\sigma_1, \ldots, \sigma_r$ in $M$ and integers $m_1, \ldots, m_r$,
called multiplicities (the multiplicities can be positive, negative, or zero).
A chain is denoted by


$$c_k = m_1\sigma_1 + \cdots + m_r\sigma_r.$$


We introduce the natural identifications
$$m_1\sigma + m_2\sigma = (m_1 + m_2)\sigma,$$
$$m_1\sigma_1 + m_2\sigma_2 = m_2\sigma_2 + m_1\sigma_1, \qquad 0\sigma = 0, \qquad c_k + 0 = c_k.$$


PROBLEM 9. Show that the set of all $k$-chains on $M$ forms a commutative group if we define the
addition of chains by the formula


$$(m_1\sigma_1 + \cdots + m_r\sigma_r) + (m_1'\sigma_1' + \cdots + m_{r'}'\sigma_{r'}') = m_1\sigma_1 + \cdots + m_r\sigma_r + m_1'\sigma_1' + \cdots + m_{r'}'\sigma_{r'}'.$$


F Example: the boundary of a polyhedron 


Let $D$ be a convex oriented $k$-dimensional polyhedron in $k$-dimensional
euclidean space $\mathbb{R}^k$. The boundary of $D$ is the $(k-1)$-chain $\partial D$ on $\mathbb{R}^k$ defined
in the following way (Figure 152).

The cells $\sigma_i$ of the chain $\partial D$ are the $(k-1)$-dimensional faces $D_i$ of the
polyhedron $D$, together with maps $f_i: D_i \to \mathbb{R}^k$ embedding the faces in $\mathbb{R}^k$ and
orientations $\mathrm{Or}_i$ defined below; the multiplicities are equal to 1:


$$\partial D = \sum \sigma_i, \qquad \sigma_i = (D_i, f_i, \mathrm{Or}_i).$$


Rule of orientation of the boundary. Let $e_1, \ldots, e_k$ be an oriented frame in
$\mathbb{R}^k$. Let $D_i$ be one of the faces of $D$. We choose an interior point of $D_i$ and there
construct a vector $n$ outwardly normal to the polyhedron $D$. An orienting
frame for the face $D_i$ will be a frame $f_1, \ldots, f_{k-1}$ on $D_i$ such that the frame
$(n, f_1, \ldots, f_{k-1})$ is oriented correctly (i.e., the same way as the frame $e_1, \ldots, e_k$).


Figure 152 Oriented boundary


The boundary of a chain is defined in an analogous way. Let $\sigma = (D, f, \mathrm{Or})$
be a $k$-dimensional cell in the manifold $M$. Its boundary $\partial\sigma$ is the $(k-1)$-chain
$\partial\sigma = \sum \sigma_i$ consisting of the cells $\sigma_i = (D_i, f_i, \mathrm{Or}_i)$, where the $D_i$ are
the $(k-1)$-dimensional faces of $D$, $\mathrm{Or}_i$ are orientations chosen by the rule
above, and $f_i$ are the restrictions of the mapping $f: D \to M$ to the face $D_i$.

The boundary $\partial c_k$ of the $k$-dimensional chain $c_k$ in $M$ is the sum of the
boundaries of the cells of $c_k$ with multiplicities (Figure 153):


$$\partial c_k = \partial(m_1\sigma_1 + \cdots + m_r\sigma_r) = m_1\,\partial\sigma_1 + \cdots + m_r\,\partial\sigma_r.$$


Obviously, $\partial c_k$ is a $(k-1)$-chain on $M$.⁵⁷


Figure 153 Boundary of a chain


PROBLEM 10. Show that the boundary of the boundary of any chain is zero: $\partial\partial c_k = 0$.

Hint. By the linearity of $\partial$ it is enough to show that $\partial\partial D = 0$ for a convex polyhedron $D$. It
remains to verify that every $(k-2)$-dimensional face of $D$ appears in $\partial\partial D$ twice, with opposite
signs. It is enough to prove this for $k = 2$ (planar cross-sections).
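
For the unit cube the cancellation described in the hint can be carried out by machine. The sketch below is an illustration, not the book's construction: faces of the cube are encoded as tuples with entries 0, 1 or `'*'` (a free coordinate), chains are dictionaries from faces to integer multiplicities, and each face is given the orientation of its free coordinates taken in increasing order. With that assumption, the outward-normal rule gives the face obtained by fixing the $j$-th free coordinate (counting from 0) at 1 the multiplicity $(-1)^j$, and the opposite face the opposite sign.

```python
def boundary_of_face(face):
    """Boundary of one face of the unit cube as a chain {face: multiplicity}.

    A face is a tuple with entries 0, 1 or '*' ('*' marks a free coordinate).
    Fixing the j-th free coordinate (counting from 0) at 1 gives multiplicity
    (-1)**j, and fixing it at 0 gives the opposite sign.
    """
    chain = {}
    free = [i for i, c in enumerate(face) if c == '*']
    for j, i in enumerate(free):
        for value, sign in ((1, (-1) ** j), (0, -((-1) ** j))):
            sub = face[:i] + (value,) + face[i + 1:]
            chain[sub] = chain.get(sub, 0) + sign
    return chain


def boundary(chain):
    """Boundary of a chain, extended from faces by linearity."""
    out = {}
    for face, m in chain.items():
        for sub, sign in boundary_of_face(face).items():
            out[sub] = out.get(sub, 0) + m * sign
    return {f: m for f, m in out.items() if m != 0}


square = {('*', '*'): 1}
cube = {('*', '*', '*'): 1}

edges_of_square = boundary(square)        # four oriented edges
dd_square = boundary(edges_of_square)     # every vertex cancels
dd_cube = boundary(boundary(cube))        # every edge cancels
```

For the square, the four edges come out traversed counterclockwise, and each vertex then appears once as an end point and once as a starting point, so the double boundary is the empty chain.
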


G The integral of a form over a chain 


Let $\omega^k$ be a $k$-form on $M$, and $c_k$ a $k$-chain on $M$, $c_k = \sum m_i\sigma_i$. The integral
of the form $\omega^k$ over the chain $c_k$ is the sum of the integrals on the cells, counting
multiplicities:

$$\int_{c_k} \omega^k = \sum m_i \int_{\sigma_i} \omega^k.$$


PROBLEM 11. Show that the integral depends linearly on the form:


$$\int_{c_k} \omega_1^k + \omega_2^k = \int_{c_k} \omega_1^k + \int_{c_k} \omega_2^k.$$


PROBLEM 12. Show that integration of a fixed form $\omega^k$ on chains $c_k$ defines a homomorphism from
the group of chains to the line.


57 We are taking $k > 1$ here. One-dimensional chains are included in the general scheme if we
make the following definitions: a zero-dimensional chain consists of a collection of points with
multiplicities; the boundary of an oriented interval $AB$ is $B - A$ (the point $B$ with multiplicity 1
and $A$ with multiplicity $-1$); the boundary of a point is empty.
EXAMPLE 1. Let $M$ be the plane $\{(p, q)\}$, $\omega^1$ the form $p\,dq$, and $c_1$ the chain consisting of one cell $\sigma$
with multiplicity 1:


$$[0 \le t \le 2\pi] \xrightarrow{\ f\ } (p = \cos t,\ q = \sin t).$$


Then $\int_{c_1} p\,dq = \pi$. In general, if a chain $c_1$ represents the boundary of a region $G$ (Figure 154), then
$\int_{c_1} p\,dq$ is equal to the area of $G$ with sign $+$ or $-$ depending on whether the pair of vectors
(outward normal, oriented boundary vector) has the same or opposite orientation as the pair
($p$ axis, $q$ axis).
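
The value of this integral can be recovered from the definition by Riemann sums. The sketch below is an illustration only: the tangent vector $\xi_i$ is replaced by the chord between consecutive points of the path, the number of subdivisions `n` is an arbitrary choice, and the function names are hypothetical.

```python
import math


def integrate_one_form(form, path, t0, t1, n=20000):
    """Riemann sum of a 1-form along a path.

    form(point, vector) is the value of the form at `point` on the tangent
    vector `vector`; the tangent vector xi_i is approximated by the chord
    path(t_{i+1}) - path(t_i).
    """
    total = 0.0
    dt = (t1 - t0) / n
    for i in range(n):
        a = path(t0 + i * dt)
        b = path(t0 + (i + 1) * dt)
        xi = tuple(q - p for p, q in zip(a, b))
        total += form(a, xi)
    return total


def p_dq(point, vector):
    """The form p dq on the (p, q) plane."""
    p, _q = point
    _dp, dq = vector
    return p * dq


def circle(t):
    return (math.cos(t), math.sin(t))


area = integrate_one_form(p_dq, circle, 0.0, 2 * math.pi)
reversed_area = integrate_one_form(p_dq, lambda t: circle(-t), 0.0, 2 * math.pi)
```

The first sum approximates $\pi$, the area of the unit disk; traversing the circle in the opposite direction gives the same number with the opposite sign.
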


Figure 154 The integral of the form p dq over the boundary of a region is equal to the
area of the region.


EXAMPLE 2. Let $M$ be the oriented three-dimensional euclidean space $\mathbb{R}^3$. Then every 1-form on
$M$ corresponds to some vector field $A$ ($\omega^1 = \omega^1_A$), where
$$\omega^1_A(\xi) = (A, \xi).$$


The integral of $\omega^1_A$ on a chain $c_1$ representing a curve $l$ is called the circulation of the field $A$
over the curve $l$:
$$\int_{c_1} \omega^1_A = \int_l (A, dl).$$


Every 2-form on $M$ also corresponds to some field $A$ ($\omega^2 = \omega^2_A$, where $\omega^2_A(\xi, \eta) = (A, \xi, \eta)$).
The integral of the form $\omega^2_A$ on a chain $c_2$ representing an oriented surface $S$ is called the
flux of the field $A$ through the surface $S$:


$$\int_{c_2} \omega^2_A = \int_S (A, dn).$$


PROBLEM 13. Find the flux of the field $A = (1/R^2)e_R$ over the surface of the sphere $x^2 + y^2 + z^2 = 1$,
oriented by the vectors $e_x$, $e_y$ at the point $z = 1$. Find the flux of the same field over the surface
of the ellipsoid $(x^2/a^2) + (y^2/b^2) + z^2 = 1$ oriented the same way.

Hint. Cf. Section 36H. 


PROBLEM 14. Suppose that, in the $2n$-dimensional space $\mathbb{R}^{2n} = \{(p_1, \ldots, p_n; q_1, \ldots, q_n)\}$, we are
given a 2-chain $c_2$ representing a two-dimensional oriented surface $S$ with boundary $l$. Find


$$\int_{c_2} dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n \quad \text{and} \quad \int_l p_1\,dq_1 + \cdots + p_n\,dq_n.$$


ANSWER. The sum of the oriented areas of the projection of $S$ on the two-dimensional coordinate
planes $p_i, q_i$.
36 Exterior differentiation


We define here exterior differentiation of k-forms and prove Stokes’ theorem: the integral of the 
derivative of a form over a chain is equal to the integral of the form itself over the boundary of 
the chain. 


A Example: the divergence of a vector field 


The exterior derivative of a $k$-form $\omega$ on a manifold $M$ is a $(k+1)$-form $d\omega$
on the same manifold. Going from a form to its exterior derivative is analogous
to forming the differential of a function or the divergence of a vector
field. We recall the definition of divergence.


Figure 155 Definition of divergence of a vector field


Let $A$ be a vector field on the oriented euclidean three-space $\mathbb{R}^3$, and let $S$
be the boundary of a parallelepiped $\Pi$ with edges $\xi_1$, $\xi_2$, and $\xi_3$ at the vertex $x$
(Figure 155). Consider the (“outward”) flux of the field $A$ through the
surface $S$:


$$F(\Pi) = \int_S (A, dn).$$


If the parallelepiped $\Pi$ is very small, the flux $F$ is approximately proportional
to the product of the volume of the parallelepiped, $V = (\xi_1, \xi_2, \xi_3)$,
and the “source density” at the point $x$. This is the limit


$$\lim_{\varepsilon \to 0} \frac{F(\varepsilon\Pi)}{\varepsilon^3 V},$$


where $\varepsilon\Pi$ is the parallelepiped with edges $\varepsilon\xi_1$, $\varepsilon\xi_2$, $\varepsilon\xi_3$. This limit does not
depend on the choice of the parallelepiped $\Pi$ but only on the point $x$, and is
called the divergence, $\operatorname{div} A$, of the field $A$ at $x$.

To go to higher-dimensional cases, we note that the “flux of $A$ through a
surface element” is the 2-form which we called $\omega^2_A$. The divergence, then,
is the density in the expression for the 3-form


$$\omega^3 = \operatorname{div} A\,dx \wedge dy \wedge dz,$$
$$\omega^3(\xi_1, \xi_2, \xi_3) = \operatorname{div} A \cdot V(\xi_1, \xi_2, \xi_3),$$


characterizing the “sources in an elementary parallelepiped.”
The exterior derivative $d\omega^k$ of a $k$-form $\omega^k$ on an $n$-dimensional manifold
$M$ may be defined as the principal multilinear part of the integral of $\omega^k$ over
the boundaries of $(k+1)$-dimensional parallelepipeds.


B Definition of the exterior derivative 


We define the value of the form $d\omega$ on $k+1$ vectors $\xi_1, \ldots, \xi_{k+1}$ tangent to $M$
at $x$. To do this, we choose some coordinate system in a neighborhood of $x$
on $M$, i.e., a differentiable map $f$ of a neighborhood of the point 0 in euclidean
space $\mathbb{R}^n$ to a neighborhood of $x$ in $M$ (Figure 156).


Figure 156 The curvilinear parallelepiped $\Pi$.


The pre-images of the vectors $\xi_1, \ldots, \xi_{k+1} \in TM_x$ under the differential
of $f$ lie in the tangent space to $\mathbb{R}^n$ at 0. This tangent space can be naturally
identified with $\mathbb{R}^n$, so we may consider the pre-images to be vectors


$$\xi_1^*, \ldots, \xi_{k+1}^* \in \mathbb{R}^n.$$


We take the parallelepiped $\Pi^*$ in $\mathbb{R}^n$ spanned by these vectors (strictly
speaking, we must look at the standard oriented cube in $\mathbb{R}^{k+1}$ and its linear
map onto $\Pi^*$, taking the edges $e_1, \ldots, e_{k+1}$ to $\xi_1^*, \ldots, \xi_{k+1}^*$, as a
$(k+1)$-dimensional cell in $\mathbb{R}^n$). The map $f$ takes the parallelepiped $\Pi^*$ to a
$(k+1)$-dimensional cell on $M$ (a “curvilinear parallelepiped”). The boundary of the
cell $\Pi$ is a $k$-chain, $\partial\Pi$. Consider the integral of the form $\omega^k$ on the boundary
$\partial\Pi$ of $\Pi$:


$$F(\xi_1, \ldots, \xi_{k+1}) = \int_{\partial\Pi} \omega^k.$$


EXAMPLE. We will call a smooth function $\varphi: M \to \mathbb{R}$ a 0-form on $M$. The integral of the 0-form $\varphi$
on the 0-chain $c_0 = \sum m_i A_i$ (where the $m_i$ are integers and the $A_i$ points of $M$) is


$$\int_{c_0} \varphi = \sum m_i \varphi(A_i).$$


Then the definition above gives the “increment” $F(\xi_1) = \varphi(x_1) - \varphi(x)$ (Figure 157) of the
function $\varphi$, and the principal linear part of $F(\xi_1)$ at 0 is simply the differential of $\varphi$.


PROBLEM 1. Show that the function $F(\xi_1, \ldots, \xi_{k+1})$ is skew-symmetric with respect to $\xi$.


It turns out that the principal $(k+1)$-linear part of the “increment”
$F(\xi_1, \ldots, \xi_{k+1})$ is an exterior $(k+1)$-form on the tangent space $TM_x$ to $M$
at $x$. This form does not depend on the coordinate system that was used to
define the curvilinear parallelepiped $\Pi$. It is called the exterior derivative, or
differential, of the form $\omega^k$ (at the point $x$) and is denoted by $d\omega^k$.


Figure 157 The integral over the boundary of a one-dimensional parallelepiped is the
change in the function.


C A theorem on exterior derivatives 


Theorem. There is a unique $(k+1)$-form $\Omega$ on $TM_x$ which is the principal
$(k+1)$-linear part at 0 of the integral over the boundary of a curvilinear
parallelepiped, $F(\xi_1, \ldots, \xi_{k+1})$; i.e.,


$$(1) \qquad F(\varepsilon\xi_1, \ldots, \varepsilon\xi_{k+1}) = \varepsilon^{k+1}\,\Omega(\xi_1, \ldots, \xi_{k+1}) + o(\varepsilon^{k+1}) \qquad (\varepsilon \to 0).$$


The form $\Omega$ does not depend on the choice of coordinates involved in the
definition of $F$. If, in the local coordinate system $x_1, \ldots, x_n$ on $M$, the form
$\omega^k$ is written as


$$\omega^k = \sum a_{i_1, \ldots, i_k}\,dx_{i_1} \wedge \cdots \wedge dx_{i_k},$$
then $\Omega$ is written as
$$(2) \qquad \Omega = d\omega^k = \sum da_{i_1, \ldots, i_k} \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_k}.$$
We will carry out the proof of this theorem for the case of a form $\omega^1 =
a(x_1, x_2)\,dx_1$ on the $x_1, x_2$ plane. The proof in the general case is entirely
analogous, but the calculations are somewhat longer.


We calculate $F(\xi, \eta)$, i.e., the integral of $\omega^1$ on the boundary of the
parallelogram $\Pi$ with sides $\xi$ and $\eta$ and vertex at 0 (Figure 158). The chain $\partial\Pi$ is
given by the mappings of the interval $0 \le t \le 1$ to the plane $t \to \xi t$,
$t \to \xi + \eta t$, $t \to \eta t$, and $t \to \eta + \xi t$ with multiplicities 1, 1, $-1$, and $-1$. Therefore,


$$\int_{\partial\Pi} \omega^1 = \int_0^1 \big[a(\xi t) - a(\xi t + \eta)\big]\xi_1 - \big[a(\eta t) - a(\eta t + \xi)\big]\eta_1\,dt,$$


where $\xi_1 = dx_1(\xi)$, $\eta_1 = dx_1(\eta)$, $\xi_2 = dx_2(\xi)$, and $\eta_2 = dx_2(\eta)$ are the
components of the vectors $\xi$ and $\eta$. But


$$a(\xi t + \eta) - a(\xi t) = \frac{\partial a}{\partial x_1}\eta_1 + \frac{\partial a}{\partial x_2}\eta_2 + O(\xi^2, \eta^2)$$
(the derivatives are taken at $x_1 = x_2 = 0$). In the same way
$$a(\eta t + \xi) - a(\eta t) = \frac{\partial a}{\partial x_1}\xi_1 + \frac{\partial a}{\partial x_2}\xi_2 + O(\xi^2, \eta^2).$$
By using these expressions in the integral, we find that


$$F(\xi, \eta) = \int_{\partial\Pi} \omega^1 = \frac{\partial a}{\partial x_2}(\xi_2\eta_1 - \xi_1\eta_2) + o(\xi^2, \eta^2).$$
The principal bilinear part of $F$, as promised in (1), turns out to be the value
of the exterior 2-form

$$\frac{\partial a}{\partial x_2}\,dx_2 \wedge dx_1$$


on the pair of vectors $\xi$, $\eta$. Thus the form obtained is given by formula (2),
since


$$da \wedge dx_1 = \frac{\partial a}{\partial x_1}\,dx_1 \wedge dx_1 + \frac{\partial a}{\partial x_2}\,dx_2 \wedge dx_1 = \frac{\partial a}{\partial x_2}\,dx_2 \wedge dx_1.$$


Finally, if the coordinate system $x_1, x_2$ is changed to another (Figure 159),
the parallelogram $\Pi$ is changed to a nearby curvilinear parallelogram $\Pi'$, so
that the difference in the values of the integrals, $\int_{\partial\Pi} \omega^1 - \int_{\partial\Pi'} \omega^1$, will be
small of more than second order (prove it!). □


Figure 158 Theorem on exterior derivatives
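
The statement just proved, that the integral over the boundary of a small parallelogram is $\varepsilon^2$ times the value of $(\partial a/\partial x_2)\,dx_2 \wedge dx_1$ up to higher-order terms, can be observed numerically. The sketch below is an illustration under assumptions of its own: the coefficient $a(x_1, x_2) = 3x_2 + x_1 x_2 + \cos x_1$ is an arbitrary smooth example with $\partial a/\partial x_2 = 3$ at the origin, each edge integral is computed by the midpoint rule, and the function names are hypothetical.

```python
import math


def a(x1, x2):
    """Coefficient of the 1-form omega = a(x1, x2) dx1 (an arbitrary smooth example)."""
    return 3.0 * x2 + x1 * x2 + math.cos(x1)


def integral_over_edge(start, direction, n=2000):
    """Midpoint-rule integral of a dx1 over the edge t -> start + direction * t, 0 <= t <= 1."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        x1 = start[0] + direction[0] * t
        x2 = start[1] + direction[1] * t
        total += a(x1, x2) * direction[0] / n
    return total


def F(xi, eta):
    """Integral of omega over the boundary of the parallelogram with sides xi, eta at 0.

    The four edges enter with multiplicities 1, 1, -1, -1, as in the proof.
    """
    origin = (0.0, 0.0)
    return (integral_over_edge(origin, xi) + integral_over_edge(xi, eta)
            - integral_over_edge(origin, eta) - integral_over_edge(eta, xi))


def d_omega(xi, eta):
    """(da/dx2) dx2 ^ dx1 at the origin, where da/dx2 = 3 for the example above."""
    return 3.0 * (xi[1] * eta[0] - xi[0] * eta[1])


xi, eta = (1.0, 0.5), (-0.3, 2.0)
eps = 1e-3
scaled = F(tuple(eps * c for c in xi), tuple(eps * c for c in eta)) / eps ** 2
exact = d_omega(xi, eta)
```

For this choice of $\varepsilon$ the two numbers differ by a quantity of order $\varepsilon$, which is the $o(\varepsilon^2)$ remainder of formula (1) after division by $\varepsilon^2$.
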


Figure 159 Independence of the exterior derivative from the coordinate system 
PROBLEM 2. Carry out the proof of the theorem in the general case.


PROBLEM 3. Prove the formulas for differentiating a sum and a product:
$$d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2$$
and


$$d(\omega^k \wedge \omega^l) = d\omega^k \wedge \omega^l + (-1)^k \omega^k \wedge d\omega^l.$$
PROBLEM 4. Show that the differential of a differential is equal to zero: $dd = 0$.


PROBLEM 5. Let $f: M \to N$ be a smooth map and $\omega$ a $k$-form on $N$. Show that $f^*(d\omega) = d(f^*\omega)$.


D Stokes’ formula 


One of the most important corollaries of the theorem on exterior derivatives
is the Newton-Leibniz-Gauss-Green-Ostrogradskii-Stokes-Poincaré formula:


$$(3) \qquad \int_{\partial c} \omega = \int_c d\omega,$$


where $c$ is any $(k+1)$-chain on a manifold $M$ and $\omega$ is any $k$-form on $M$.

To prove this formula it is sufficient to prove it for the case when the chain
consists of one cell $\sigma$. We assume first that this cell $\sigma$ is given by an oriented
parallelepiped $\Pi \subset \mathbb{R}^{k+1}$ (Figure 160).


Figure 160 Proof of Stokes’ formula for a parallelepiped 


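We partition $\Pi$ into $N^{k+1}$ small equal parallelepipeds $\Pi_i$ similar to $\Pi$.
Then, clearly,

$$\int_{\partial\Pi} \omega = \sum_{i=1}^{N^{k+1}} F_i, \quad \text{where } F_i = \int_{\partial\Pi_i} \omega.$$

By formula (1) we have

$$F_i = d\omega(\xi_1^i, \ldots, \xi_{k+1}^i) + o(N^{-(k+1)}),$$

where $\xi_1^i, \ldots, \xi_{k+1}^i$ are the edges of $\Pi_i$. But
$\sum_{i=1}^{N^{k+1}} d\omega(\xi_1^i, \ldots, \xi_{k+1}^i)$ is a
Riemann sum for $\int_\Pi d\omega$. It is easy to verify that $o(N^{-(k+1)})$ is uniform, so

$$\lim_{N\to\infty} \sum_{i=1}^{N^{k+1}} F_i = \lim_{N\to\infty} \sum_{i=1}^{N^{k+1}} d\omega(\xi_1^i, \ldots, \xi_{k+1}^i) = \int_\Pi d\omega.$$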
Finally, we obtain

$$\int_{\partial\Pi} \omega = \sum_i F_i = \lim_{N\to\infty} \sum_i F_i = \int_\Pi d\omega.$$

Formula (3) follows automatically from this for any chain whose polyhedra
are parallelepipeds.
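
The parallelepiped case can be checked numerically for $k = 1$. The following is a minimal sketch, not part of the original text: it assumes a 1-form $\omega = P\,dx + Q\,dy$ on a rectangle (the functions `P`, `Q` and the helper names are illustrative choices), integrates $\omega$ around the boundary, and compares the result with a Riemann sum for $d\omega = (\partial Q/\partial x - \partial P/\partial y)\,dx \wedge dy$.

```python
import math

def boundary_integral(P, Q, a, b, c, d, n=2000):
    """Integrate P dx + Q dy counterclockwise around [a, b] x [c, d] (midpoint rule)."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        y = c + (i + 0.5) * hy
        total += P(x, c) * hx   # bottom edge, traversed left to right
        total += Q(b, y) * hy   # right edge, traversed upward
        total -= P(x, d) * hx   # top edge, traversed right to left
        total -= Q(a, y) * hy   # left edge, traversed downward
    return total

def interior_integral(P, Q, a, b, c, d, n=200, eps=1e-6):
    """Riemann sum of (dQ/dx - dP/dy) over n*n small cells of the rectangle."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            dQdx = (Q(x + eps, y) - Q(x - eps, y)) / (2 * eps)
            dPdy = (P(x, y + eps) - P(x, y - eps)) / (2 * eps)
            total += (dQdx - dPdy) * hx * hy
    return total

# Illustrative 1-form: omega = -y sin(x) dx + x^2 y dy on [0, 1] x [0, 2].
P = lambda x, y: -y * math.sin(x)
Q = lambda x, y: x * x * y

lhs = boundary_integral(P, Q, 0.0, 1.0, 0.0, 2.0)
rhs = interior_integral(P, Q, 0.0, 1.0, 0.0, 2.0)
exact = 4.0 - 2.0 * math.cos(1.0)   # integral of (2xy + sin x) over the rectangle
```

Both `lhs` and `rhs` should agree with `exact` up to discretization error; refining `n` plays the role of $N \to \infty$ in the proof.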

To prove formula (3) for any convex polyhedron $D$, it is enough to prove
it for a simplex,⁵⁸ since $D$ can always be partitioned into simplices (Figure
161):

$$D = \sum D_i, \qquad \partial D = \sum \partial D_i.$$


Figure 161 Division of a convex polyhedron into simplices 


Figure 162 Proof of Stokes’ formula for a simplex 


We will prove formula (3) for a simplex. Notice that a k-dimensional 
oriented cube can be mapped onto a k-dimensional simplex so that: 


1. The interior of the cube goes diffeomorphically, with its orientation 
preserved, onto the interior of the simplex; 

2. The interiors of some (k — 1)-dimensional faces of the cube go diffeo- 
morphically, with their orientations preserved, onto the interiors of the 
faces of the simplex; the images of the remaining (k — 1)-dimensional 
faces of the cube lie in the (k — 2)-dimensional faces of the simplex. 


For example, for $k = 2$ such a map of the cube $0 \le x_1, x_2 \le 1$ onto the
triangle is given by the formula $y_1 = x_1$, $y_2 = x_1 x_2$ (Figure 162). Then,
formula (3) for the simplex follows from formula (3) for the cube and the
change of variables theorem (cf. Section 35C).


58 A two-dimensional simplex is a triangle, a three-dimensional simplex is a tetrahedron, a
k-dimensional simplex is the convex hull of $k + 1$ points in $\mathbb{R}^n$ which do not lie in any
$(k - 1)$-dimensional plane.

EXAMPLE: $\{x \in \mathbb{R}^k : x_i \ge 0 \text{ and } \sum_{i=1}^k x_i \le 1\}$.
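
As an illustration of the map used for $k = 2$, the sketch below (an illustrative check, not from the original text; the function names are hypothetical) samples the unit square, verifies that every image point lies in the triangle $0 \le y_2 \le y_1 \le 1$, and recovers the area of the triangle from the jacobian $\partial(y_1, y_2)/\partial(x_1, x_2) = x_1$. The jacobian vanishes only on the edge $x_1 = 0$, which is the face of the square collapsed to a vertex.

```python
def cube_to_triangle(x1, x2):
    """The map y1 = x1, y2 = x1 * x2 from the unit square onto the triangle."""
    return x1, x1 * x2

def jacobian(x1, x2):
    # det [[1, 0], [x2, x1]] = x1, positive in the interior of the square
    return x1

n = 400
h = 1.0 / n
area = 0.0
inside = True
for i in range(n):
    for j in range(n):
        x1, x2 = (i + 0.5) * h, (j + 0.5) * h
        y1, y2 = cube_to_triangle(x1, x2)
        inside = inside and (0.0 <= y2 <= y1 <= 1.0)
        # change of variables: area element of the triangle pulled back to the square
        area += jacobian(x1, x2) * h * h
```

The accumulated `area` approximates $\int_0^1\!\int_0^1 x_1\,dx_1\,dx_2 = 1/2$, the area of the triangle.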


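EXAMPLE 1. Consider the 1-form

$$\omega^1 = p_1\,dq_1 + \cdots + p_n\,dq_n = p\,dq$$

on $\mathbb{R}^{2n}$ with coordinates $p_1, \ldots, p_n, q_1, \ldots, q_n$. Then
$d\omega^1 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n = dp \wedge dq$, so

$$\iint_{c_2} dp \wedge dq = \int_{\partial c_2} p\,dq.$$

In particular, if $c_2$ is a closed surface ($\partial c_2 = 0$), then $\iint_{c_2} dp \wedge dq = 0$.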


E Example 2—Vector analysis 


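In a three-dimensional oriented riemannian space $M$, every vector field $A$
corresponds to a 1-form $\omega_A^1$ and a 2-form $\omega_A^2$. Therefore, exterior
differentiation can be considered as an operation on vectors.

Exterior differentiation of 0-forms (functions), 1-forms, and 2-forms
corresponds to the operations of gradient, curl, and divergence defined by the
relations

$$df = \omega^1_{\operatorname{grad} f}, \qquad d\omega_A^1 = \omega^2_{\operatorname{curl} A}, \qquad d\omega_A^2 = (\operatorname{div} A)\,\omega^3$$

(the form $\omega^3$ is the volume element on $M$). Thus, it follows from (3) that

$$f(y) - f(x) = \int_l \operatorname{grad} f\; dl \quad \text{if } \partial l = y - x,$$

$$\int_l A\; dl = \iint_S \operatorname{curl} A\; dn \quad \text{if } \partial S = l,$$

$$\iint_S A\; dn = \iiint_D (\operatorname{div} A)\,\omega^3 \quad \text{if } \partial D = S.$$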

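PROBLEM 6. Show that

$$\operatorname{div}[A, B] = (\operatorname{curl} A, B) - (\operatorname{curl} B, A),$$
$$\operatorname{curl} aA = [\operatorname{grad} a, A] + a \operatorname{curl} A,$$
$$\operatorname{div} aA = (\operatorname{grad} a, A) + a \operatorname{div} A.$$

Hint. By the formula for differentiating the product of forms,

$$d(\omega^2_{[A,B]}) = d(\omega_A^1 \wedge \omega_B^1) = d\omega_A^1 \wedge \omega_B^1 - \omega_A^1 \wedge d\omega_B^1.$$


PROBLEM 7. Show that curl grad = div curl = 0.
Hint. $dd = 0$.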
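
The first identity of Problem 6 can be spot-checked numerically in euclidean $\mathbb{R}^3$. This is only a sanity check under stated assumptions, not a proof: the two fields `A` and `B`, the sample point `pt`, and the central-difference helpers are arbitrary illustrative choices, with $[\,\cdot\,,\cdot\,]$ read as the vector product and $(\,\cdot\,,\cdot\,)$ as the scalar product.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def partial(F, x, i, j, h=1e-4):
    """Central difference of component j of the field F with respect to x_i."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (F(xp)[j] - F(xm)[j]) / (2 * h)

def div(F, x):
    return sum(partial(F, x, i, i) for i in range(3))

def curl(F, x):
    return (partial(F, x, 1, 2) - partial(F, x, 2, 1),
            partial(F, x, 2, 0) - partial(F, x, 0, 2),
            partial(F, x, 0, 1) - partial(F, x, 1, 0))

# Two arbitrary smooth test fields (illustrative assumptions).
A = lambda x: (x[1] * x[2], math.sin(x[0]), x[0] * x[1] ** 2)
B = lambda x: (math.cos(x[2]), x[0] * x[2], x[1] + x[0] ** 2)
AxB = lambda x: cross(A(x), B(x))

pt = (0.3, -0.7, 1.1)
lhs = div(AxB, pt)                                          # div [A, B]
rhs = dot(curl(A, pt), B(pt)) - dot(curl(B, pt), A(pt))     # (curl A, B) - (curl B, A)
```

The values `lhs` and `rhs` should coincide up to the finite-difference error.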
F Appendix 1: Vector operations in triply orthogonal systems


Let $x_1, x_2, x_3$ be a triply orthogonal coordinate system on $M$,
$ds^2 = E_1\,dx_1^2 + E_2\,dx_2^2 + E_3\,dx_3^2$, and $e_i$ the coordinate unit vectors
(cf. Section 34E).


PROBLEM 8. Given the components of a vector field $A = A_1 e_1 + A_2 e_2 + A_3 e_3$, find the
components of its curl.
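Solution. According to Section 34E

$$\omega_A^1 = A_1\sqrt{E_1}\,dx_1 + A_2\sqrt{E_2}\,dx_2 + A_3\sqrt{E_3}\,dx_3.$$

Therefore,

$$d\omega_A^1 = \left(\frac{\partial (A_3\sqrt{E_3})}{\partial x_2} - \frac{\partial (A_2\sqrt{E_2})}{\partial x_3}\right) dx_2 \wedge dx_3 + \cdots = \omega^2_{\operatorname{curl} A}.$$

According to Section 34E, we have

$$\operatorname{curl} A = \frac{1}{\sqrt{E_2E_3}}\left(\frac{\partial (A_3\sqrt{E_3})}{\partial x_2} - \frac{\partial (A_2\sqrt{E_2})}{\partial x_3}\right) e_1 + \cdots
= \frac{1}{\sqrt{E_1E_2E_3}}
\begin{vmatrix}
\sqrt{E_1}\,e_1 & \sqrt{E_2}\,e_2 & \sqrt{E_3}\,e_3 \\
\dfrac{\partial}{\partial x_1} & \dfrac{\partial}{\partial x_2} & \dfrac{\partial}{\partial x_3} \\
A_1\sqrt{E_1} & A_2\sqrt{E_2} & A_3\sqrt{E_3}
\end{vmatrix}.$$

In particular, in cartesian, cylindrical, and spherical coordinates on $\mathbb{R}^3$,

$$\begin{aligned}
\operatorname{curl} A &= \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) e_x + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) e_y + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) e_z \\
&= \frac{1}{r}\left(\frac{\partial A_z}{\partial \varphi} - \frac{\partial (rA_\varphi)}{\partial z}\right) e_r + \left(\frac{\partial A_r}{\partial z} - \frac{\partial A_z}{\partial r}\right) e_\varphi + \frac{1}{r}\left(\frac{\partial (rA_\varphi)}{\partial r} - \frac{\partial A_r}{\partial \varphi}\right) e_z \\
&= \frac{1}{R\cos\theta}\left(\frac{\partial A_\theta}{\partial \varphi} - \frac{\partial (A_\varphi\cos\theta)}{\partial \theta}\right) e_R + \frac{1}{R}\left(\frac{\partial A_R}{\partial \theta} - \frac{\partial (RA_\theta)}{\partial R}\right) e_\varphi + \frac{1}{R}\left(\frac{\partial (RA_\varphi)}{\partial R} - \frac{1}{\cos\theta}\frac{\partial A_R}{\partial \varphi}\right) e_\theta.
\end{aligned}$$
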


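PROBLEM 9. Find the divergence of the field $A = A_1 e_1 + A_2 e_2 + A_3 e_3$.
Solution. $\omega_A^2 = A_1\sqrt{E_2E_3}\,dx_2 \wedge dx_3 + \cdots$. Therefore,

$$d\omega_A^2 = \frac{\partial}{\partial x_1}\left(A_1\sqrt{E_2E_3}\right) dx_1 \wedge dx_2 \wedge dx_3 + \cdots.$$

By the definition of divergence,

$$d\omega_A^2 = \operatorname{div} A\,\sqrt{E_1E_2E_3}\; dx_1 \wedge dx_2 \wedge dx_3.$$

This means

$$\operatorname{div} A = \frac{1}{\sqrt{E_1E_2E_3}}\left(\frac{\partial}{\partial x_1}\left(A_1\sqrt{E_2E_3}\right) + \frac{\partial}{\partial x_2}\left(A_2\sqrt{E_3E_1}\right) + \frac{\partial}{\partial x_3}\left(A_3\sqrt{E_1E_2}\right)\right).$$

In particular, in cartesian, cylindrical, and spherical coordinates on $\mathbb{R}^3$:

$$\begin{aligned}
\operatorname{div} A &= \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}
= \frac{1}{r}\left(\frac{\partial (rA_r)}{\partial r} + \frac{\partial A_\varphi}{\partial \varphi}\right) + \frac{\partial A_z}{\partial z} \\
&= \frac{1}{R^2\cos\theta}\left(\frac{\partial (R^2\cos\theta\, A_R)}{\partial R} + \frac{\partial (RA_\varphi)}{\partial \varphi} + \frac{\partial (R\cos\theta\, A_\theta)}{\partial \theta}\right).
\end{aligned}$$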

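PROBLEM 10. The Laplace operator on $M$ is the operator $\Delta = \operatorname{div} \operatorname{grad}$. Find its expression in the
coordinates $x_i$.


ANSWER.

$$\Delta f = \frac{1}{\sqrt{E_1E_2E_3}}\left[\frac{\partial}{\partial x_1}\left(\sqrt{\frac{E_2E_3}{E_1}}\,\frac{\partial f}{\partial x_1}\right) + \cdots\right].$$

In particular, on $\mathbb{R}^3$

$$\begin{aligned}
\Delta f &= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
= \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \varphi^2} + \frac{\partial^2 f}{\partial z^2} \\
&= \frac{1}{R^2\cos\theta}\left[\frac{\partial}{\partial R}\left(R^2\cos\theta\,\frac{\partial f}{\partial R}\right) + \frac{\partial}{\partial \varphi}\left(\frac{1}{\cos\theta}\frac{\partial f}{\partial \varphi}\right) + \frac{\partial}{\partial \theta}\left(\cos\theta\,\frac{\partial f}{\partial \theta}\right)\right].
\end{aligned}$$


G Appendix 2: Closed forms and cycles
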
The flux of an incompressible fluid (without sources) across the boundary
of a region $D$ is equal to zero. We will formulate a higher-dimensional
analogue to this obvious assertion. The higher-dimensional analogue of an
incompressible fluid is called a closed form. The field $A$ has no sources if
$\operatorname{div} A = 0$.


Definition. A differential form $\omega$ on a manifold $M$ is closed if its exterior
derivative is zero: $d\omega = 0$.


In particular, the 2-form $\omega_A^2$ corresponding to a field $A$ without sources
is closed. Also, we have, by Stokes' formula (3):


Theorem. The integral of a closed form $\omega^k$ over the boundary of any $(k + 1)$-dimensional
chain $c_{k+1}$ is equal to zero:

$$\int_{\partial c_{k+1}} \omega^k = 0 \quad \text{if } d\omega^k = 0.$$


PROBLEM 11. Show that the differential of a form is always closed. 


On the other hand, there are closed forms which are not differentials. For
example, take for $M$ the three-dimensional euclidean space $\mathbb{R}^3$ without $O$:
$M = \mathbb{R}^3 - O$, with the 2-form being the flux of the field $A = (1/R^2)\,e_R$
(Figure 163). It is easy to convince oneself that $\operatorname{div} A = 0$, so that our 2-form
$\omega_A^2$ is closed. At the same time, the flux over any sphere with center $O$ is equal
to $4\pi$. We will show that the integral of the differential of a form over the
sphere must be zero.


Figure 163 The field A


Definition. A cycle on a manifold M is a chain whose boundary is equal to 
zero. 


The oriented surface of our sphere can be considered to be a cycle. It 
immediately follows from Stokes’ formula (3) that 


Theorem. The integral of a differential over any cycle is equal to zero:

$$\int_{c_{k+1}} d\omega^k = 0 \quad \text{if } \partial c_{k+1} = 0.$$


Thus, our 2-form $\omega_A^2$ is not the differential of any 1-form.
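
Both claims about the field $A = (1/R^2)\,e_R$ can be checked numerically. The sketch below is illustrative only; it assumes cartesian components for the field and parametrizes the sphere by longitude $\varphi$ and latitude $\theta$, with area element $R^2\cos\theta\,d\varphi\,d\theta$. The helper names are hypothetical.

```python
import math

def field(x, y, z):
    """A = e_R / R^2, written in cartesian components."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (x / r3, y / r3, z / r3)

def flux_through_sphere(radius, n=200):
    """Midpoint rule over longitude phi in [0, 2 pi] and latitude theta in [-pi/2, pi/2]."""
    dphi = 2 * math.pi / n
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        for j in range(n):
            theta = -math.pi / 2 + (j + 0.5) * dtheta
            # outward unit normal at the sample point
            nx = math.cos(theta) * math.cos(phi)
            ny = math.cos(theta) * math.sin(phi)
            nz = math.sin(theta)
            ax, ay, az = field(radius * nx, radius * ny, radius * nz)
            total += (ax * nx + ay * ny + az * nz) * radius ** 2 * math.cos(theta) * dphi * dtheta
    return total

def divergence(x, y, z, h=1e-4):
    """Central-difference approximation of div A at a point away from O."""
    return ((field(x + h, y, z)[0] - field(x - h, y, z)[0])
            + (field(x, y + h, z)[1] - field(x, y - h, z)[1])
            + (field(x, y, z + h)[2] - field(x, y, z - h)[2])) / (2 * h)

flux_small = flux_through_sphere(0.5)
flux_large = flux_through_sphere(3.0)
div_at_point = divergence(0.4, -1.2, 0.9)
```

The flux comes out close to $4\pi$ for both radii while the divergence vanishes away from $O$, which is exactly the situation that rules out $\omega_A^2 = d\omega^1$.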

The existence of closed forms on M which are not differentials is related 
to the topological properties of M. One can show that every closed k-form 
on a vector space is the differential of some (k — 1)-form (Poincaré’s lemma). 


PROBLEM 12. Prove Poincaré's lemma for 1-forms.
Hint. Consider $\int_{x_0}^{x} \omega^1 = \varphi(x)$.


PROBLEM 13. Show that in a vector space the integral of a closed form over any cycle is zero. 
Hint. Construct a (k + 1)-chain whose boundary is the given cycle (Figure 164). 


Figure 164 Cone over a cycle 
Namely, for any chain $c$ consider the “cone over $c$ with vertex 0.” If we denote the operation
of constructing a cone by $p$, then

$$\partial \circ p + p \circ \partial = 1 \quad \text{(the identity map)}.$$

Therefore, if the chain $c$ is closed, $\partial(pc) = c$.

PROBLEM. Show that every closed form on a vector space is an exterior derivative.


Hint. Use the cone construction. Let $\omega^k$ be a differential k-form on $\mathbb{R}^n$. We define a
$(k - 1)$-form (the “co-cone over $\omega$”) $p\omega^k$ in the following way: for any chain $c_{k-1}$

$$\int_{c_{k-1}} p\omega^k = \int_{pc_{k-1}} \omega^k.$$

It is easy to see that the $(k - 1)$-form $p\omega^k$ exists and is unique; its value on the vectors
$\xi_1, \ldots, \xi_{k-1}$, tangent to $\mathbb{R}^n$ at $x$, is equal to

$$(p\omega^k)_x(\xi_1, \ldots, \xi_{k-1}) = \int_0^1 \omega^k_{tx}(x, t\xi_1, \ldots, t\xi_{k-1})\,dt.$$

It is easy to see that

$$d \circ p + p \circ d = 1 \quad \text{(the identity map)}.$$

Therefore, if the form $\omega^k$ is closed, $d(p\omega^k) = \omega^k$.
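
For $k = 1$ the co-cone is a function, $(p\omega^1)(x) = \int_0^1 \omega^1_{tx}(x)\,dt$, and the statement $d(p\omega^1) = \omega^1$ can be tested numerically. The sketch below assumes a particular closed 1-form on $\mathbb{R}^2$, $\omega^1 = 2xy\,dx + (x^2 + 3y^2)\,dy$, chosen only for illustration; the helper names are hypothetical.

```python
def a(x, y):
    return 2 * x * y            # coefficient of dx

def b(x, y):
    return x * x + 3 * y * y    # coefficient of dy; da/dy = db/dx, so the form is closed

def cocone(x, y, n=1000):
    """(p omega)(x): integral over 0 <= t <= 1 of omega at the point tx applied to the vector x."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += (a(t * x, t * y) * x + b(t * x, t * y) * y) * h
    return total

def gradient(f, x, y, h=1e-5):
    """Central-difference components of df."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

px, py = 0.8, -0.3
phi_x, phi_y = gradient(cocone, px, py)   # should reproduce (a, b) at (px, py)
```

Here the co-cone equals $x^2 y + y^3$ in closed form, so its differential reproduces the coefficients `a` and `b`.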


PROBLEM. Let $X$ be a vector field on $M$ and $\omega$ a differential k-form. We define a differential
$(k - 1)$-form $i_X\omega$ (the interior derivative of $\omega$ by $X$) by the relation

$$(i_X\omega)(\xi_1, \ldots, \xi_{k-1}) = \omega(X, \xi_1, \ldots, \xi_{k-1}).$$

Prove the homotopy formula

$$i_X d + d\,i_X = L_X,$$

where $L_X$ is the differentiation operator in the direction of the field $X$.
[The action of $L_X$ on a form is defined, using the phase flow $\{g^t\}$ of the field $X$, by the relation

$$(L_X\omega)(\xi) = \frac{d}{dt}\bigg|_{t=0} \omega(g^t_*\xi).$$

$L_X$ is called the Lie derivative or fisherman's derivative: the flow carries all possible
differential-geometric objects past the fisherman, and the fisherman sits there and differentiates them.]

Hint. We denote by $H$ the “homotopy operator” associating to a k-chain $\gamma: \sigma \to M$ the
$(k + 1)$-chain $H\gamma: (I \times \sigma) \to M$ according to the formula $(H\gamma)(t, x) = g^t\gamma(x)$ (where $I = [0, 1]$).
Then

$$g^1\gamma - \gamma = \partial(H\gamma) + H(\partial\gamma).$$

PROBLEM. Prove the formula for differentiating a vector product on three-dimensional euclidean
space (or on a riemannian manifold):

$$\operatorname{curl}[a, b] = \{a, b\} + a \operatorname{div} b - b \operatorname{div} a$$

(where $\{a, b\} = L_a b$ is the Poisson bracket of the vector fields, cf. Section 39).
Hint. If $\tau$ is the volume element, then

$$i_{\operatorname{curl}[a,b]}\tau = d\,i_a i_b \tau, \qquad (\operatorname{div} a)\,\tau = d\,i_a\tau, \qquad \text{and} \qquad \{a, b\} = L_a b;$$

by using these relations and the fact that $d\tau = 0$, it is easy to derive the formula for $\operatorname{curl}[a, b]$ from
the homotopy formula.
H Appendix 3: Cohomology and homology


The set of all k-forms on $M$ is a vector space, the closed k-forms a subspace
and the differentials of $(k - 1)$-forms a subspace of the subspace of
closed forms. The quotient space

$$\frac{\{\text{closed forms}\}}{\{\text{differentials}\}} = H^k(M, \mathbb{R})$$

is called the k-th cohomology group of the manifold $M$. An element of this
group is a class of closed forms differing from one another only by a differential.

PROBLEM 14. Show that for the circle $S^1$ we have $H^1(S^1, \mathbb{R}) = \mathbb{R}$.


The dimension of the space $H^k(M, \mathbb{R})$ is called the k-th Betti number of $M$.


PROBLEM 15. Find the first Betti number of the torus $T^2 = S^1 \times S^1$.


The flux of an incompressible fluid (without sources) over the surfaces of
two concentric spheres is the same. In general, when integrating a closed form
over a k-dimensional cycle, we can replace the cycle with another one provided
that their difference is the boundary of a $(k + 1)$-chain (Figure 165):

$$\int_a \omega^k = \int_b \omega^k \quad \text{if } a - b = \partial c_{k+1} \text{ and } d\omega^k = 0.$$


Figure 165 Homologous cycles


Poincaré called two such cycles $a$ and $b$ homologous.
With a suitable definition⁵⁹ of the group of chains on a manifold $M$ and its
subgroups of cycles and boundaries (i.e., cycles homologous to zero), the
quotient group

$$\frac{\{\text{cycles}\}}{\{\text{boundaries}\}} = H_k(M)$$

is called the k-th homology group of $M$.
An element of this group is a class of cycles homologous to one another.
The rank of this group is also equal to the k-th Betti number of $M$ (“De
Rham's Theorem”).


59 For this our group $\{c_k\}$ must be made smaller by identifying pieces which differ only by the
choice of parametrization $f$ or the choice of polyhedron $D$. In particular, we may assume that
$D$ is always one and the same simplex or cube. Furthermore, we must take every degenerate
k-cell $(D, f, \mathrm{Or})$ to be zero, i.e., $(D, f, \mathrm{Or}) = 0$ if $f = f_2 \cdot f_1$, where $f_1: D \to D'$ and $D'$ has dimension
smaller than $k$.
A symplectic structure on a manifold is a closed nondegenerate differential
2-form. The phase space of a mechanical system has a natural symplectic 
structure. 

On a symplectic manifold, as on a riemannian manifold, there is a natural 
isomorphism between vector fields and 1-forms. A vector field on a sym- 
plectic manifold corresponding to the differential of a function is called a 
hamiltonian vector field. A vector field on a manifold determines a phase 
flow, i.e., a one-parameter group of diffeomorphisms. The phase flow of a 
hamiltonian vector field on a symplectic manifold preserves the symplectic 
structure of phase space. 

The vector fields on a manifold form a Lie algebra. The hamiltonian 
vector fields on a symplectic manifold also form a Lie algebra. The operation 
in this algebra is called the Poisson bracket. 


37 Symplectic structures on manifolds 


We define here symplectic manifolds, hamiltonian vector fields, and the standard symplectic 
structure on the cotangent bundle. 


A Definition 


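Let $M^{2n}$ be an even-dimensional differentiable manifold. A symplectic
structure on $M^{2n}$ is a closed nondegenerate differential 2-form $\omega^2$ on $M^{2n}$:

$$d\omega^2 = 0 \quad \text{and} \quad \forall \xi \ne 0 \;\; \exists \eta : \omega^2(\xi, \eta) \ne 0 \qquad (\xi, \eta \in TM_x).$$

The pair $(M^{2n}, \omega^2)$ is called a symplectic manifold.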




8: Symplectic manifolds 


EXAMPLE. Consider the vector space $\mathbb{R}^{2n}$ with coordinates $p_i, q_i$ and let $\omega^2 = \sum dp_i \wedge dq_i$.


PROBLEM. Verify that $(\mathbb{R}^{2n}, \omega^2)$ is a symplectic manifold. For $n = 1$ the pair $(\mathbb{R}^2, \omega^2)$ is the pair
(the plane, area).


The following example explains the appearance of symplectic manifolds 
in dynamics. Along with the tangent bundle of a differentiable manifold, it is 
often useful to look at its dual—the cotangent bundle. 


B The cotangent bundle and its symplectic structure 


Let V be an n-dimensional differentiable manifold. A 1-form on the tangent 
space to V at a point x is called a cotangent vector to V at x. The set of all 
cotangent vectors to $V$ at $x$ forms an n-dimensional vector space, dual to
the tangent space $TV_x$. We will denote this vector space of cotangent vectors
by $T^*V_x$ and call it the cotangent space to $V$ at $x$.

The union of the cotangent spaces to the manifold at all of its points is 
called the cotangent bundle of V and is denoted by T*V. The set T*V has a 
natural structure of a differentiable manifold of dimension 2n. A point of 
T*V is a 1-form on the tangent space to V at some point of V. If q is a choice 
of n local coordinates for points in V, then such a form is given by its n com- 
ponents p. Together, the 2n numbers p, q form a collection of local coordinates 
for points in T*V. 

There is a natural projection $f: T^*V \to V$ (sending every 1-form on $TV_x$ to
the point $x$). The projection $f$ is differentiable and surjective. The pre-image
of a point $x \in V$ under $f$ is the cotangent space $T^*V_x$.


Theorem. The cotangent bundle $T^*V$ has a natural symplectic structure. In the
local coordinates described above, this symplectic structure is given by the
formula

$$\omega^2 = dp \wedge dq = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n.$$


PROOF. First, we define a distinguished 1-form on $T^*V$. Let $\xi \in T(T^*V)_p$ be
a vector tangent to the cotangent bundle at the point $p \in T^*V_x$ (Figure 166).
The derivative $f_*: T(T^*V) \to TV$ of the natural projection $f: T^*V \to V$ takes
$\xi$ to a vector $f_*\xi$ tangent to $V$ at $x$. We define a 1-form $\omega^1$ on $T^*V$ by the
relation $\omega^1(\xi) = p(f_*\xi)$. In the local coordinates described above, this form
is $\omega^1 = p\,dq$. By the example in A, the closed 2-form $\omega^2 = d\omega^1$ is
nondegenerate. □


Remark. Consider a lagrangian mechanical system with configuration manifold $V$ and
lagrangian function $L$. It is easy to see that the lagrangian “generalized velocity” $\dot q$ is a
tangent vector to the configuration manifold $V$, and the “generalized momentum” $p = \partial L/\partial \dot q$
is a cotangent vector. Therefore, the “p, q” phase space of the lagrangian system is the cotangent
bundle of the configuration manifold. The theorem above shows that the phase space of a
mechanical problem has a natural symplectic manifold structure.


Figure 166 The 1-form p dq on the cotangent bundle


PROBLEM. Show that the Legendre transform does not depend on the coordinate system: it
takes a function $L: TV \to \mathbb{R}$ on the tangent bundle to a function $H: T^*V \to \mathbb{R}$ on the cotangent
bundle.


C Hamiltonian vector fields 


A riemannian structure on a manifold establishes an isomorphism between 
the spaces of tangent vectors and 1-forms. A symplectic structure establishes 
a similar isomorphism. 


Definition. To each vector $\xi$, tangent to a symplectic manifold $(M^{2n}, \omega^2)$ at
the point $x$, we associate a 1-form $\omega^1_\xi$ on $TM_x$ by the formula

$$\omega^1_\xi(\eta) = \omega^2(\eta, \xi) \qquad \forall \eta \in TM_x.$$


PROBLEM. Show that the correspondence $\xi \to \omega^1_\xi$ is an isomorphism between the 2n-dimensional
vector spaces of vectors and of 1-forms.


EXAMPLE. In $\mathbb{R}^{2n} = \{(p, q)\}$ we will identify vectors and 1-forms by using the euclidean structure
$(x, x) = p^2 + q^2$. Then the correspondence $\xi \to \omega^1_\xi$ determines a transformation $\mathbb{R}^{2n} \to \mathbb{R}^{2n}$.


PROBLEM. Calculate the matrix of this transformation in the basis $p, q$.

ANSWER. $\begin{pmatrix} 0 & E \\ -E & 0 \end{pmatrix}$.


We will denote by $I$ the isomorphism $I: T^*M_x \to TM_x$ constructed above.
Now let $H$ be a function on a symplectic manifold $M^{2n}$. Then $dH$ is a differential
1-form on $M$, and at every point there is a tangent vector to $M$ associated
to it. In this way we obtain a vector field $I\,dH$ on $M$.


Definition. The vector field I dH is called a hamiltonian vector field; H is 
called the hamiltonian function. 
EXAMPLE. If $M^{2n} = \mathbb{R}^{2n} = \{(p, q)\}$, then we obtain the phase velocity vector field of Hamilton's
canonical equations:

$$\dot x = I\,dH(x) \iff \dot p = -\frac{\partial H}{\partial q} \quad \text{and} \quad \dot q = \frac{\partial H}{\partial p}.$$
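
To make the isomorphism $I$ concrete for $n = 1$, the sketch below (an illustration under stated assumptions, not part of the original text) takes $\omega^2 = dp \wedge dq$ on $\mathbb{R}^2$ and the pendulum hamiltonian $H = p^2/2 - \cos q$ as an arbitrary example. It builds $I\,dH$ as the vector $\xi$ with $\omega^2(\eta, \xi) = dH(\eta)$ for all $\eta$, and checks that its components are $(-\partial H/\partial q, \partial H/\partial p)$. The function names are hypothetical.

```python
import math

def omega2(xi, eta):
    """omega^2 = dp ^ dq evaluated on two vectors given by their (p, q) components."""
    return xi[0] * eta[1] - xi[1] * eta[0]

def dH(H, x, h=1e-6):
    """Components (dH/dp, dH/dq) by central differences."""
    p, q = x
    return ((H(p + h, q) - H(p - h, q)) / (2 * h),
            (H(p, q + h) - H(p, q - h)) / (2 * h))

def hamiltonian_field(H, x):
    """I dH: the vector xi satisfying omega^2(eta, xi) = dH(eta) for every eta."""
    Hp, Hq = dH(H, x)
    return (-Hq, Hp)   # (p-component, q-component)

H = lambda p, q: 0.5 * p * p - math.cos(q)   # pendulum, chosen for illustration
x = (0.7, 1.2)
xi = hamiltonian_field(H, x)
Hp, Hq = dH(H, x)
# defining relation checked on the basis vectors eta = e_p and eta = e_q
residuals = [omega2(eta, xi) - (Hp * eta[0] + Hq * eta[1])
             for eta in ((1.0, 0.0), (0.0, 1.0))]
```

The components of `xi` are the right-hand sides of Hamilton's equations, $\dot p = -\sin q$ and $\dot q = p$ for this choice of $H$.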


38 Hamiltonian phase flows and their integral 
invariants 


Liouville’s theorem asserts that the phase flow preserves volume. Poincaré found a whole 
series of differential forms which are preserved by the hamiltonian phase flow. 


A Hamiltonian phase flows preserve the symplectic structure 


Let $(M^{2n}, \omega^2)$ be a symplectic manifold and $H: M^{2n} \to \mathbb{R}$ a function. Assume
that the vector field $I\,dH$ corresponding to $H$ gives a 1-parameter group of
diffeomorphisms $g^t: M^{2n} \to M^{2n}$:

$$\frac{d}{dt}\bigg|_{t=0} g^t x = I\,dH(x).$$

The group $g^t$ is called the hamiltonian phase flow with hamiltonian function $H$.


Theorem. A hamiltonian phase flow preserves the symplectic structure:

$$(g^t)^*\omega^2 = \omega^2.$$


In the case $n = 1$, $M^{2n} = \mathbb{R}^2$, this theorem says that the phase flow $g^t$
preserves area (Liouville's theorem).

For the proof of this theorem, it is useful to introduce the following notation
(Figure 167).

Let $M$ be an arbitrary manifold, $c$ a k-chain on $M$ and $g^t: M \to M$ a
one-parameter family of differentiable mappings. We will construct a $(k + 1)$-chain
$Jc$ on $M$, which we will call the track of the chain $c$ under the homotopy
$g^t$, $0 \le t \le \tau$.

Let $(D, f, \mathrm{Or})$ be one of the cells in the chain $c$. To this cell will be associated
a cell $(D', f', \mathrm{Or}')$ in the chain $Jc$, where $D' = I \times D$ is the direct product of
the interval $0 \le t \le \tau$ and $D$; the mapping $f': D' \to M$ is obtained from
$f: D \to M$ by the formula $f'(t, x) = g^t f(x)$; and the orientation $\mathrm{Or}'$ of the
space $\mathbb{R}^{k+1}$ containing $D'$ is given by the frame $e_0, e_1, \ldots, e_k$, where $e_0$ is the
unit vector of the $t$ axis, and $e_1, \ldots, e_k$ is an oriented frame for $D$.


Figure 167 Track of a cycle under homotopy


We could say that $Jc$ is the chain swept out by $c$ under the homotopy $g^t$,
$0 \le t \le \tau$. The boundary of the chain $Jc$ consists of “end-walls” made up
of the initial and final positions of $c$, and “side surfaces” filled in by the
boundary of $c$.

It is easy to verify that under the choice of orientation made above,

$$(1) \qquad \partial(Jc_k) = g^\tau c_k - c_k - J\partial c_k.$$

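Lemma. Let $\gamma$ be a 1-chain in the symplectic manifold $(M^{2n}, \omega^2)$. Let $g^t$ be a
phase flow on $M$ with hamiltonian function $H$. Then

$$\frac{d}{d\tau} \int_{J\gamma} \omega^2 = \int_{g^\tau\gamma} dH.$$


PROOF. It is sufficient to consider a chain $\gamma$ with one cell $f: [0, 1] \to M$. We
introduce the notation

$$f'(s, t) = g^t f(s), \qquad \xi = \frac{\partial f'}{\partial s}, \qquad \text{and} \qquad \eta = \frac{\partial f'}{\partial t} \in TM_{f'(s,t)}.$$

By the definition of the integral

$$\int_{J\gamma} \omega^2 = \int_0^1\!\!\int_0^\tau \omega^2(\xi, \eta)\,dt\,ds.$$

But by the definition of the phase flow, $\eta$ is a vector (at the point $f'(s, t)$) of
the hamiltonian field with hamiltonian function $H$. By definition of a hamiltonian
field, $\omega^2(\xi, \eta) = dH(\xi)$. Thus

$$\int_{J\gamma} \omega^2 = \int_0^\tau \left(\int_{g^t\gamma} dH\right) dt. \qquad \square$$


Corollary. If the chain $\gamma$ is closed ($\partial\gamma = 0$), then $\int_{J\gamma} \omega^2 = 0$.
PROOF. $\int_\gamma dH = \int_{\partial\gamma} H = 0$. □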


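PROOF OF THE THEOREM. We consider any 2-chain $c$. We have

$$0 \overset{1}{=} \int_{Jc} d\omega^2 \overset{2}{=} \int_{\partial Jc} \omega^2 \overset{3}{=} \left(\int_{g^\tau c} - \int_c - \int_{J\partial c}\right)\omega^2 \overset{4}{=} \int_{g^\tau c} \omega^2 - \int_c \omega^2$$

(1 since $\omega^2$ is closed, 2 by Stokes' formula, 3 by formula (1), 4 by the corollary
above with $\gamma = \partial c$). Thus the integrals of the form $\omega^2$ on any chain $c$ and on
its image $g^\tau c$ are the same. □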
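
For $n = 1$ the theorem says that the jacobian determinant of $g^t$ equals 1. The sketch below checks this numerically under explicit assumptions that are not in the original text: the pendulum hamiltonian $H = p^2/2 - \cos q$ as an example, a classical Runge-Kutta integrator as an approximation of the exact flow, and central differences for the jacobian.

```python
import math

def field(p, q):
    """Right-hand sides p' = -dH/dq, q' = dH/dp for H = p^2/2 - cos q."""
    return (-math.sin(q), p)

def flow(p, q, t, steps=2000):
    """Approximate g^t by classical fourth-order Runge-Kutta steps."""
    h = t / steps
    for _ in range(steps):
        k1 = field(p, q)
        k2 = field(p + 0.5 * h * k1[0], q + 0.5 * h * k1[1])
        k3 = field(p + 0.5 * h * k2[0], q + 0.5 * h * k2[1])
        k4 = field(p + h * k3[0], q + h * k3[1])
        p += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        q += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return p, q

def jacobian_det(p, q, t, eps=1e-5):
    """Determinant of the derivative of g^t at (p, q), by central differences."""
    pp, pm = flow(p + eps, q, t), flow(p - eps, q, t)
    qp, qm = flow(p, q + eps, t), flow(p, q - eps, t)
    dP_dp = (pp[0] - pm[0]) / (2 * eps)
    dQ_dp = (pp[1] - pm[1]) / (2 * eps)
    dP_dq = (qp[0] - qm[0]) / (2 * eps)
    dQ_dq = (qp[1] - qm[1]) / (2 * eps)
    return dP_dp * dQ_dq - dP_dq * dQ_dp

det = jacobian_det(0.5, 1.0, 3.0)
```

The determinant stays at 1 up to numerical error, so the area element $dp \wedge dq$ is carried into itself by the flow.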


PROBLEM. Is every one-parameter group of diffeomorphisms of $M^{2n}$ which preserves the
symplectic structure a hamiltonian phase flow?
Hint. Cf. Section 40.
B Integral invariants


Let $g: M \to M$ be a differentiable map.


Definition. A differential k-form $\omega$ is called an integral invariant of the map $g$
if the integrals of $\omega$ on any k-chain $c$ and on its image under $g$ are the same:

$$\int_{gc} \omega = \int_c \omega.$$


EXAMPLE. If $M = \mathbb{R}^2$ and $\omega^2 = dp \wedge dq$ is the area element, then $\omega^2$ is an integral invariant of
any map $g$ with jacobian 1.


PROBLEM. Show that a form $\omega^k$ is an integral invariant of a map $g$ if and only if $g^*\omega^k = \omega^k$.


PROBLEM. Show that if the forms $\omega^k$ and $\omega^l$ are integral invariants of the map $g$, then the form
$\omega^k \wedge \omega^l$ is also an integral invariant of $g$.


The theorem in subsection A can be formulated as follows: 


Theorem. The form $\omega^2$ giving the symplectic structure is an integral invariant
of a hamiltonian phase flow.


We now consider the exterior powers of $\omega^2$,

$$(\omega^2)^2 = \omega^2 \wedge \omega^2, \qquad (\omega^2)^3 = \omega^2 \wedge \omega^2 \wedge \omega^2, \quad \ldots.$$

Corollary. Each of the forms $(\omega^2)^2, (\omega^2)^3, (\omega^2)^4, \ldots$ is an integral invariant of a
hamiltonian phase flow.


PROBLEM. Suppose that the dimension of the symplectic manifold $(M^{2n}, \omega^2)$ is $2n$. Show that
$(\omega^2)^k = 0$ for $k > n$, and that $(\omega^2)^n$ is a nondegenerate 2n-form on $M^{2n}$.


We define a volume element on $M^{2n}$ using $(\omega^2)^n$. Then, a hamiltonian
phase flow preserves volume, and we obtain Liouville's theorem from the
corollary above.


EXAMPLE. Consider the symplectic coordinate space $M^{2n} = \mathbb{R}^{2n} = \{(p, q)\}$,
$\omega^2 = dp \wedge dq = \sum dp_i \wedge dq_i$. In this case the form $(\omega^2)^k$ is proportional to
the form

$$\omega^{2k} = \sum_{i_1 < \cdots < i_k} dp_{i_1} \wedge \cdots \wedge dp_{i_k} \wedge dq_{i_1} \wedge \cdots \wedge dq_{i_k}.$$

The integral of $\omega^{2k}$ is equal to the sum of the oriented volumes of projections
onto the coordinate planes $(p_{i_1}, \ldots, p_{i_k}, q_{i_1}, \ldots, q_{i_k})$.


A map $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is called canonical if it has $\omega^2$ as an integral invariant.
A canonical map is generally called a canonical transformation. Each of the
forms $\omega^4, \omega^6, \ldots, \omega^{2n}$ is an integral invariant of every canonical transformation.
Therefore, under a canonical transformation, the sum of the oriented areas
of projections onto the coordinate planes $(p_{i_1}, \ldots, p_{i_k}, q_{i_1}, \ldots, q_{i_k})$, $1 \le k \le n$,
is preserved. In particular, canonical transformations preserve volume.

The hamiltonian phase flow given by the equations $\dot p = -\partial H/\partial q$,
$\dot q = \partial H/\partial p$ consists of canonical transformations $g^t$.

The integral invariants considered above are also called absolute integral 
invariants. 


Definition. A differential k-form $\omega$ is called a relative integral invariant of the
map $g: M \to M$ if $\int_{gc} \omega = \int_c \omega$ for every closed k-chain $c$.


Theorem. Let $\omega$ be a relative integral invariant of a map $g$. Then $d\omega$ is an
absolute integral invariant of $g$.


PROOF. Let $c$ be a $(k + 1)$-chain. Then

$$\int_c d\omega \overset{1}{=} \int_{\partial c} \omega \overset{2}{=} \int_{g\partial c} \omega \overset{3}{=} \int_{\partial gc} \omega \overset{4}{=} \int_{gc} d\omega$$

(1 and 4 are by Stokes' formula, 2 by the definition of relative invariant, and
3 by the definition of boundary). □


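EXAMPLE. A canonical map $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ has the 1-form

$$\omega^1 = p\,dq = \sum_{i=1}^n p_i\,dq_i$$

as a relative integral invariant.


In fact, every closed chain $c$ on $\mathbb{R}^{2n}$ is the boundary of some chain $\sigma$, and we find

$$\int_{gc} \omega^1 \overset{1}{=} \int_{g\partial\sigma} \omega^1 \overset{2}{=} \int_{\partial g\sigma} \omega^1 \overset{3}{=} \int_{g\sigma} d\omega^1 \overset{4}{=} \int_\sigma d\omega^1 \overset{5}{=} \int_{\partial\sigma} \omega^1 \overset{6}{=} \int_c \omega^1$$

(1 and 6 are by definition of $\sigma$, 2 by definition of $\partial$, 3 and 5 by Stokes' formula, and 4 since $g$
is canonical and $d\omega^1 = d(p\,dq) = dp \wedge dq = \omega^2$).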


PROBLEM. Let $d\omega^k$ be an absolute integral invariant of the map $g: M \to M$. Does it follow that
$\omega^k$ is a relative integral invariant?


ANSWER. No, if there is a closed k-chain on $M$ which is not a boundary.


C The law of conservation of energy 


Theorem. The function H is a first integral of the hamiltonian phase flow with 
hamiltonian function H. 


PROOF. The derivative of $H$ in the direction of a vector $\eta$ is equal to the value
of $dH$ on $\eta$. By definition of the hamiltonian field $\eta = I\,dH$ we find

$$dH(\eta) = \omega^2(\eta, I\,dH) = \omega^2(\eta, \eta) = 0. \qquad \square$$
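
In the coordinates $(p, q)$ of $\mathbb{R}^{2n}$ the same statement is a one-line computation along a solution of Hamilton's equations; it is written out here only as a coordinate illustration of the invariant proof above:

```latex
\frac{dH}{dt}
  = \frac{\partial H}{\partial p}\,\dot p + \frac{\partial H}{\partial q}\,\dot q
  = \frac{\partial H}{\partial p}\left(-\frac{\partial H}{\partial q}\right)
    + \frac{\partial H}{\partial q}\,\frac{\partial H}{\partial p}
  = 0 .
```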


PROBLEM. Show that the 1-form dH is an integral invariant of the phase flow with hamiltonian 
function H. 
39 The Lie algebra of vector fields


Every pair of vector fields on a manifold determines a new vector field, called their Poisson
bracket.⁶⁰ The Poisson bracket operation makes the vector space of infinitely differentiable
vector fields on a manifold into a Lie algebra.


A Lie algebras 


One example of a Lie algebra is a three-dimensional oriented euclidean
vector space equipped with the operation of vector multiplication. The
vector product is bilinear, skew-symmetric, and satisfies the Jacobi identity

$$[[A, B], C] + [[B, C], A] + [[C, A], B] = 0.$$


Definition. A Lie algebra is a vector space $L$, together with a bilinear skew-symmetric
operation $L \times L \to L$ which satisfies the Jacobi identity.


The operation is usually denoted by square brackets and called the 
commutator. 


PROBLEM. Show that the set of $n \times n$ matrices becomes a Lie algebra if we define the commutator
by $[A, B] = AB - BA$.
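
A quick sanity check of this statement, which is of course not a proof: the sketch below uses three arbitrary integer $3 \times 3$ matrices (illustrative values only), so that the arithmetic is exact, and verifies skew-symmetry and the Jacobi identity for the commutator. The helper names are hypothetical.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def jacobi_sum(A, B, C):
    """[[A, B], C] + [[B, C], A] + [[C, A], B]."""
    terms = [commutator(commutator(A, B), C),
             commutator(commutator(B, C), A),
             commutator(commutator(C, A), B)]
    n = len(A)
    return [[sum(T[i][j] for T in terms) for j in range(n)] for i in range(n)]

# Arbitrary integer test matrices (illustrative values).
A = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
B = [[0, 1, 1], [2, 0, -3], [1, 1, 1]]
C = [[2, 0, 1], [-1, 3, 0], [4, 1, -2]]

zero = [[0] * 3 for _ in range(3)]
skew = [[-v for v in row] for row in commutator(B, A)]   # -[B, A]
result = jacobi_sum(A, B, C)
```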


B Vector fields and differential operators 


Let $M$ be a smooth manifold and $A$ a smooth vector field on $M$: at every
point $x \in M$ we are given a tangent vector $A(x) \in TM_x$. With every such
vector field we associate the following two objects:


1. The one-parameter group of diffeomorphisms or flow $A^t: M \to M$ for which
$A$ is the velocity vector field (Figure 168):⁶¹

$$\frac{d}{dt}\bigg|_{t=0} A^t x = A(x).$$

2. The first-order differential operator $L_A$. We refer here to the differentiation
of functions in the direction of the field $A$: for any function $\varphi: M \to \mathbb{R}$
the derivative in the direction of $A$ is a new function $L_A\varphi$, whose value
at a point $x$ is

$$(L_A\varphi)(x) = \frac{d}{dt}\bigg|_{t=0} \varphi(A^t x).$$


Figure 168 The group of diffeomorphisms given by a vector field


60 Or Lie bracket [Trans. note].


61 By theorems of existence, uniqueness, and differentiability in the theory of ordinary
differential equations, the group $A^t$ is defined if the manifold $M$ is compact. In the general case
the maps $A^t$ are defined only in a neighborhood of $x$ and only for small $t$; this is enough for the
following constructions.


PROBLEM. Show that the operator $L_A$ is linear:

$$L_A(\lambda_1\varphi_1 + \lambda_2\varphi_2) = \lambda_1 L_A\varphi_1 + \lambda_2 L_A\varphi_2 \qquad (\lambda_1, \lambda_2 \in \mathbb{R}).$$

Also, prove Leibniz's formula $L_A(\varphi_1\varphi_2) = \varphi_1 L_A\varphi_2 + \varphi_2 L_A\varphi_1$.

EXAMPLE. Let $(x_1, \ldots, x_n)$ be local coordinates on $M$. In this coordinate system the vector $A(x)$
is given by its components $(A_1(x), \ldots, A_n(x))$; the flow $A^t$ is given by the system of differential
equations

$$\dot x_1 = A_1(x), \; \ldots, \; \dot x_n = A_n(x)$$

and, therefore, the derivative of $\varphi = \varphi(x_1, \ldots, x_n)$ in the direction $A$ is

$$L_A\varphi = A_1\frac{\partial\varphi}{\partial x_1} + \cdots + A_n\frac{\partial\varphi}{\partial x_n}.$$

We could say that in the coordinates $(x_1, \ldots, x_n)$ the operator $L_A$ has the form

$$L_A = A_1\frac{\partial}{\partial x_1} + \cdots + A_n\frac{\partial}{\partial x_n};$$

this is the general form of a first-order linear differential operator on coordinate space.


PROBLEM. Show that the correspondences between vector fields $A$, flows $A^t$, and differentiations
$L_A$ are one-to-one.


C The Poisson bracket of vector fields 


Suppose that we are given two vector fields $A$ and $B$ on a manifold $M$. The
corresponding flows $A^t$ and $B^s$ do not, in general, commute: $A^tB^s \ne B^sA^t$
(Figure 169).


PROBLEM. Find an example.
Solution. The fields $A = e_1$, $B = x_1e_2$ on the $(x_1, x_2)$ plane.
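
For this pair of fields the flows can be written in closed form, which makes the failure to commute explicit. The sketch below is only an illustration; the sample point and the times `t`, `s` are arbitrary choices. Here $A^t$ is a translation along $x_1$ and $B^s$ is a shear, since $x_1$ is constant along $B$.

```python
def flow_A(t, x):
    """Flow of A = e_1: translation along the x_1 axis."""
    return (x[0] + t, x[1])

def flow_B(s, x):
    """Flow of B = x_1 e_2: x_1 stays fixed, x_2 grows at the constant rate x_1."""
    return (x[0], x[1] + s * x[0])

x = (2.0, 5.0)
t, s = 0.5, 0.25

ab = flow_A(t, flow_B(s, x))   # A^t B^s x
ba = flow_B(s, flow_A(t, x))   # B^s A^t x
gap = (ab[0] - ba[0], ab[1] - ba[1])
```

The two end points differ by the vector $(0, -st)$, a discrepancy that is bilinear in $s$ and $t$.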




Figure 169 Non-commutative flows 
To measure the degree of noncommutativity of the two flows $A^t$ and $B^s$ we
consider the points $A^tB^sx$ and $B^sA^tx$. In order to estimate the difference
between these points, we compare the value at them of some smooth function
$\varphi$ on the manifold $M$. The difference

$$\Delta(t; s; x) = \varphi(A^tB^sx) - \varphi(B^sA^tx)$$

is clearly a differentiable function which is zero for $s = 0$ and for $t = 0$.
Therefore, the first term different from 0 in the Taylor series in $s$ and $t$ of $\Delta$
at 0 contains $st$, and the other terms of second order vanish. We will calculate
this principal bilinear term of $\Delta$ at 0.


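Lemma 1. The mixed partial derivative $\partial^2\Delta/\partial s\,\partial t$ at 0 is equal to the
commutator of differentiation in the directions $A$ and $B$:

$$\frac{\partial^2}{\partial s\,\partial t}\bigg|_{s=t=0} \{\varphi(A^tB^sx) - \varphi(B^sA^tx)\} = (L_BL_A\varphi - L_AL_B\varphi)(x).$$

PROOF. By the definition of $L_A$,

$$\frac{\partial}{\partial t}\bigg|_{t=0} \varphi(A^tB^sx) = (L_A\varphi)(B^sx).$$

If we denote the function $L_A\varphi$ by $\psi$, then by the definition of $L_B$

$$\frac{\partial}{\partial s}\bigg|_{s=0} \psi(B^sx) = (L_B\psi)x.$$

Thus,

$$\frac{\partial^2}{\partial s\,\partial t}\bigg|_{s=t=0} \varphi(A^tB^sx) = (L_BL_A\varphi)x. \qquad \square$$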


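We now consider the commutator of differentiation operators $L_BL_A - L_AL_B$.
At first glance this is a second-order differential operator.


Lemma 2. The operator $L_BL_A - L_AL_B$ is a first-order linear differential
operator.


PROOF. Let $(A_1, \ldots, A_n)$ and $(B_1, \ldots, B_n)$ be the components of the fields
$A$ and $B$ in the local coordinate system $(x_1, \ldots, x_n)$ on $M$. Then

$$L_BL_A\varphi = \sum_{i=1}^n B_i\frac{\partial}{\partial x_i}\sum_{j=1}^n A_j\frac{\partial\varphi}{\partial x_j} = \sum_{i,j=1}^n B_i\frac{\partial A_j}{\partial x_i}\frac{\partial\varphi}{\partial x_j} + \sum_{i,j=1}^n B_iA_j\frac{\partial^2\varphi}{\partial x_i\,\partial x_j}.$$

If we subtract $L_AL_B\varphi$, the term with the second derivatives of $\varphi$ vanishes,
and we obtain

$$(L_BL_A - L_AL_B)\varphi = \sum_{i,j=1}^n \left(B_i\frac{\partial A_j}{\partial x_i} - A_i\frac{\partial B_j}{\partial x_i}\right)\frac{\partial\varphi}{\partial x_j}. \qquad \square$$
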
Since every first-order linear differential operator is given by a vector
field, our operator $L_BL_A - L_AL_B$ also corresponds to some vector field $C$.


Definition. The Poisson bracket or commutator of two vector fields $A$ and
$B$ on a manifold $M$⁶² is the vector field $C$ for which

$$L_C = L_BL_A - L_AL_B.$$

The Poisson bracket of two vector fields is denoted by

$$C = [A, B].$$


PROBLEM. Suppose that the vector fields $A$ and $B$ are given by their components $A_i$, $B_i$ in
coordinates $x_i$. Find the components of the Poisson bracket.
Solution. In the proof of Lemma 2 we proved the formula

$$[A, B]_j = \sum_{i=1}^n \left(B_i\frac{\partial A_j}{\partial x_i} - A_i\frac{\partial B_j}{\partial x_i}\right).$$
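
The component formula is easy to evaluate numerically. The sketch below is an illustration, not part of the original text: it approximates the partial derivatives by central differences (the helper names are hypothetical) and applies the formula, with the sign convention of the text, to the fields $A = e_1$, $B = x_1e_2$ used earlier as an example of noncommuting flows.

```python
def partial(F, x, i, j, h=1e-5):
    """Central difference of component j of the field F with respect to x_i."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (F(xp)[j] - F(xm)[j]) / (2 * h)

def bracket(A, B, x):
    """[A, B]_j = sum_i (B_i dA_j/dx_i - A_i dB_j/dx_i), the sign convention of the text."""
    n = len(x)
    a, b = A(x), B(x)
    return tuple(sum(b[i] * partial(A, x, i, j) - a[i] * partial(B, x, i, j)
                     for i in range(n))
                 for j in range(n))

A = lambda x: (1.0, 0.0)     # A = e_1
B = lambda x: (0.0, x[0])    # B = x_1 e_2

c = bracket(A, B, [2.0, 5.0])       # expected to be close to (0, -1), i.e. -e_2
c_rev = bracket(B, A, [2.0, 5.0])   # skew-symmetry: close to (0, +1)
```

The result $-e_2$ agrees with Lemma 1: for $\varphi = x_2$ the difference $\varphi(A^tB^sx) - \varphi(B^sA^tx)$ equals $-st$ for these two flows.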


PROBLEM. Let $A_1$ be the linear vector field of velocities of a rigid body rotating with angular
velocity $\omega_1$ around 0, and $A_2$ the same thing with angular velocity $\omega_2$. Find the Poisson bracket
$[A_1, A_2]$.


D The Jacobi identity 


Theorem. The Poisson bracket makes the vector space of vector fields on a 
manifold M into a Lie algebra. 


PROOF. Linearity and skew-symmetry of the Poisson bracket are clear. We
will prove the Jacobi identity. By definition of Poisson bracket, we have

$$L_{[[A,B],C]} = L_CL_{[A,B]} - L_{[A,B]}L_C = L_CL_BL_A - L_CL_AL_B + L_AL_BL_C - L_BL_AL_C.$$

There will be 12 terms in all in the sum $L_{[[A,B],C]} + L_{[[B,C],A]} + L_{[[C,A],B]}$.
Each term appears in the sum twice, with opposite signs. □


E A condition for the commutativity of flows 

Let A and B be vector fields on a manifold M. 

Theorem. The two flows $A^t$ and $B^s$ commute if and only if the Poisson bracket
of the corresponding vector fields $[A, B]$ is equal to zero.


PROOF. If $A^tB^s \equiv B^sA^t$, then $[A, B] = 0$ by Lemma 1. If $[A, B] = 0$, then,
by Lemma 1,

$$\varphi(A^tB^sx) - \varphi(B^sA^tx) = o(s^2 + t^2), \qquad s \to 0 \text{ and } t \to 0$$

for any function $\varphi$ at any point $x$. We will show that this implies $\varphi(A^tB^sx) =
\varphi(B^sA^tx)$ for sufficiently small $s$ and $t$. If we apply this to the local coordinates
($\varphi = x_1, \ldots, \varphi = x_n$), we obtain $A^tB^s = B^sA^t$.


62 In many books the bracket is given the opposite sign. Our sign agrees with the sign of the
commutator in the theory of Lie groups (cf. subsection F).


Consider the rectangle $0 \le t \le t_0$, $0 \le s \le s_0$ (Figure 170) in the (t, s)-plane. To every path
going from (0, 0) to $(t_0, s_0)$ and consisting of a finite number of intervals in the coordinate
directions, we associate a product of transformations of the flows $A^t$ and $B^s$. Namely, to each interval
$t_1 \le t \le t_2$ we associate $A^{t_2 - t_1}$, and to each interval $s_1 \le s \le s_2$ we associate
$B^{s_2 - s_1}$; the transformations are applied in the order in which the intervals occur in the path,
beginning at (0, 0). For example, the sides ($0 \le t \le t_0$, s = 0) and ($t = t_0$, $0 \le s \le s_0$)
correspond to the product $B^{s_0} A^{t_0}$, and the sides (t = 0, $0 \le s \le s_0$) and
($s = s_0$, $0 \le t \le t_0$) to the product $A^{t_0} B^{s_0}$.


Figure 171 Curvilinear quadrilateral βγδεα


In addition, we associate to each such path in the (t, s)-plane a path on the manifold M
starting at the point x and composed of trajectories of the flows $A^t$ and $B^s$ (Figure 171). If a
path in the (t, s)-plane corresponds to the product $A^{t_1} B^{s_1} \cdots A^{t_n} B^{s_n}$, then on the
manifold M the corresponding path ends at the point $A^{t_1} B^{s_1} \cdots A^{t_n} B^{s_n} x$. Our goal
will be to show that all these paths actually terminate at the one point
$A^{t_0} B^{s_0} x = B^{s_0} A^{t_0} x$.

We partition the intervals $0 \le t \le t_0$ and $0 \le s \le s_0$ into N equal parts, so that the whole
rectangle is divided into $N^2$ small rectangles. The passage from the sides
$(0, 0) \to (t_0, 0) \to (t_0, s_0)$ to the sides $(0, 0) \to (0, s_0) \to (t_0, s_0)$ can be accomplished
in $N^2$ steps, in each of which a pair of neighboring sides of a small rectangle is exchanged for the
other pair (Figure 172).

Figure 172 Going from one pair of sides to the other.

In general, this small rectangle corresponds to a non-closed curvilinear quadrilateral βγδεα on the
manifold M (Figure 171). Consider the distance⁶³ between its vertices α and β corresponding to the largest
values of s and t. As we saw earlier, $\rho(\alpha, \beta) < C_1 N^{-3}$ (where the constant $C_1 > 0$ does
not depend on N). Using the theorem of the differentiability of solutions of differential equations
with respect to the initial data, it is not difficult to derive from this a bound on the distance
between the ends α′ and β′ of the paths xδγββ′ and xδεαα′ on M: $\rho(\alpha', \beta') < C_2 N^{-3}$, where
the constant $C_2 > 0$ again does not depend on N. But we broke up the whole journey from
$B^{s_0} A^{t_0} x$ to $A^{t_0} B^{s_0} x$ into $N^2$ such pieces. Thus,
$\rho(A^{t_0} B^{s_0} x, B^{s_0} A^{t_0} x) < N^2 C_2 N^{-3}$ for all N. Therefore,
$A^{t_0} B^{s_0} x = B^{s_0} A^{t_0} x$. □
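
A small experiment illustrates both directions of the theorem. The three flows below are chosen
here purely for illustration and are assumptions of the sketch: translation along x (field
$\partial/\partial x$), a shear (field $x\,\partial/\partial y$), and translation along y (field
$\partial/\partial y$). The bracket of the first two fields is a nonzero field parallel to the
y-axis, while the third field has zero bracket with each of the others.

```python
def A(t):
    """Flow of the field d/dx: translation along x."""
    return lambda pt: (pt[0] + t, pt[1])


def B(s):
    """Flow of the field x d/dy: shear in the y-direction."""
    return lambda pt: (pt[0], pt[1] + s * pt[0])


def C(r):
    """Flow of the field d/dy: translation along y."""
    return lambda pt: (pt[0], pt[1] + r)


x0 = (2, 5)
t, s, r = 3, 4, 7

ab = A(t)(B(s)(x0))   # A^t B^s x
ba = B(s)(A(t)(x0))   # B^s A^t x
ac = A(t)(C(r)(x0))   # A^t C^r x
ca = C(r)(A(t)(x0))   # C^r A^t x
bc = B(s)(C(r)(x0))   # B^s C^r x
cb = C(r)(B(s)(x0))   # C^r B^s x
print(ab, ba, ac, ca, bc, cb)
```

The end points of $A^t B^s x$ and $B^s A^t x$ differ by st in the y-coordinate, a quantity of
second order in (s, t), in agreement with Lemma 1; the other two pairs coincide.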


F Appendix: Lie algebras and Lie groups 


A Lie group is a group G which is a differentiable manifold, and for which the
operations (product and inverse) are differentiable maps $G \times G \to G$ and
$G \to G$.

The tangent space, $TG_e$, to a Lie group G at the identity has a natural
Lie algebra structure; it is defined as follows:

For each tangent vector $A \in TG_e$ there is a one-parameter subgroup $A^t \subset G$
with velocity vector $A = (d/dt)|_{t=0} A^t$.

The degree of non-commutativity of two subgroups $A^t$ and $B^t$ is measured
by the product $A^t B^s A^{-t} B^{-s}$. It turns out that there is one and only one
subgroup $C^r$ for which

$$\rho(A^t B^s A^{-t} B^{-s}, C^{st}) = o(s^2 + t^2) \quad \text{as } s \text{ and } t \to 0.$$

The corresponding vector $C = (d/dr)|_{r=0} C^r$ is called the Lie bracket
C = [A, B] of the vectors A and B. It can be verified that the operation of
Lie bracket introduced in this way makes the space $TG_e$ into a Lie algebra
(i.e., the operation is bilinear, skew-symmetric, and satisfies the Jacobi
identity). This algebra is called the Lie algebra of the Lie group G.


PROBLEM. Compute the bracket operation in the Lie algebra of the group SO(3) of rotations in
three-dimensional euclidean space.


Lemma 1 shows that the Poisson bracket of vector fields can be defined 
as the Lie bracket for the “infinite-dimensional Lie group” of all diffeo- 
morphisms of the manifold M. 


⁶³ In some riemannian metric on M.
⁶⁴ Our choice of sign in the definition of Poisson bracket was determined by this correspondence.

On the other hand, the Lie bracket can be defined using the Poisson
bracket of vector fields on a Lie group G. Let $g \in G$. Right translation $R_g$ is
the map $R_g: G \to G$, $R_g h = hg$. The differential of $R_g$ at the point e maps
$TG_e$ into $TG_g$. In this way, every vector $A \in TG_e$ corresponds to a vector
field on the group: it consists of the right translations $(R_g)_* A$ and is called a
right-invariant vector field. Clearly, a right-invariant vector field on a group
is uniquely determined by its value at the identity.


PROBLEM. Show that the Poisson bracket of right-invariant vector fields on a 
Lie group G is a right-invariant vector field, and its value at the identity of 
the group is equal to the Lie bracket of the values of the original vector fields 
at the identity. 


40 The Lie algebra of hamiltonian functions 


The hamiltonian vector fields on a symplectic manifold form a subalgebra of the Lie algebra of 
all fields. The hamiltonian functions also form a Lie algebra: the operation in this algebra is 
called the Poisson bracket of functions. The first integrals of a hamiltonian phase flow form a 
subalgebra of the Lie algebra of hamiltonian functions. 


A The Poisson bracket of two functions 


Let $(M^{2n}, \omega^2)$ be a symplectic manifold. To a given function $H: M^{2n} \to \mathbb{R}$
on the symplectic manifold there corresponds a one-parameter group
$g_H^t: M^{2n} \to M^{2n}$ of canonical transformations of $M^{2n}$, namely the phase flow of the
hamiltonian function equal to H. Let $F: M^{2n} \to \mathbb{R}$ be another function on $M^{2n}$.


Definition. The Poisson bracket (F, H) of functions F and H given on a
symplectic manifold $(M^{2n}, \omega^2)$ is the derivative of the function F in the
direction of the phase flow with hamiltonian function H:

$$(F, H)(x) = \left. \frac{d}{dt} \right|_{t=0} F(g_H^t(x)).$$


Thus, the Poisson bracket of two functions on M is again a function on M. 


Corollary 1. A function F is a first integral of the phase flow with hamiltonian 
function H if and only if its Poisson bracket with H is identically zero: 
(F, H) =0. 


We can give the definition of Poisson bracket in a slightly different form
if we use the isomorphism I between 1-forms and vector fields on a symplectic
manifold $(M^{2n}, \omega^2)$. This isomorphism is defined by the relation (cf. Section
37)

$$\omega^2(\eta, I\omega^1) = \omega^1(\eta).$$

The velocity vector of the phase flow $g_H^t$ is I dH. This implies

Corollary 2. The Poisson bracket of the functions F and H is equal to the
value of the 1-form dF on the velocity vector I dH of the phase flow with
hamiltonian function H:


(F, H) = dF(I dH). 


Using the preceding formula again, we obtain 


Corollary 3. The Poisson bracket of the functions F and H is equal to the 
“skew scalar product” of the velocity vectors of the phase flows with hamil- 
tonian functions H and F: 


$$(F, H) = \omega^2(I\,dH, I\,dF).$$


It is now clear that 


Corollary 4. The Poisson bracket of the functions F and H is a skew-symmetric
bilinear function of F and H:

$$(F, H) = -(H, F)$$

and

$$(H, \lambda_1 F_1 + \lambda_2 F_2) = \lambda_1 (H, F_1) + \lambda_2 (H, F_2) \quad (\lambda_i \in \mathbb{R}).$$


Although the arguments above are obvious, they lead to nontrivial 
deductions, including the following generalization of a theorem of E. Noether. 


Theorem. If a hamiltonian function H on a symplectic manifold $(M^{2n}, \omega^2)$
admits the one-parameter group of canonical transformations given by a
hamiltonian F, then F is a first integral of the system with hamiltonian
function H.


PROOF. Since H is a first integral of the flow $g_F^t$, (H, F) = 0 (Corollary 1).
Therefore, (F, H) = 0 (Corollary 4) and F is a first integral (Corollary 1). □


PROBLEM 1. Compute the Poisson bracket of two functions F and H in the canonical coordinate
space $\mathbb{R}^{2n} = \{(p, q)\}$, $\omega^2(\xi, \eta) = (I\xi, \eta)$.
Solution. By Corollary 3 we have

$$(F, H) = \sum_{i=1}^{n} \left( \frac{\partial H}{\partial p_i} \frac{\partial F}{\partial q_i} - \frac{\partial H}{\partial q_i} \frac{\partial F}{\partial p_i} \right)$$

(we use the fact that I is symplectic and has the form

$$I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$$

in the basis (p, q)).


PROBLEM 2. Compute the Poisson brackets of the basic functions $p_i$ and $q_j$.
Solution. The gradients of the basic functions form a “symplectic basis”: their skew-scalar
products are

$$(p_i, p_j) = (p_i, q_j) = (q_i, q_j) = 0 \ (\text{if } i \ne j), \qquad (q_i, p_i) = -(p_i, q_i) = 1.$$

PROBLEM 3. Show that the map $A: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ sending $(p, q) \to (P(p, q), Q(p, q))$
is canonical if and only if the Poisson brackets of any two functions in the variables (p, q) and
(P, Q) coincide:

$$(F, H)_{p,q} = \frac{\partial H}{\partial p} \frac{\partial F}{\partial q} - \frac{\partial H}{\partial q} \frac{\partial F}{\partial p} = \frac{\partial H}{\partial P} \frac{\partial F}{\partial Q} - \frac{\partial H}{\partial Q} \frac{\partial F}{\partial P} = (F, H)_{P,Q}.$$

Solution. Let A be canonical. Then the symplectic structures $dp \wedge dq$ and $dP \wedge dQ$ coincide.
But the definition of the Poisson bracket (F, H) was given invariantly in terms of the symplectic
structure; it did not involve the coordinates. Therefore,

$$(F, H)_{p,q} = (F, H) = (F, H)_{P,Q}.$$

Conversely, suppose that the Poisson brackets $(P_i, Q_j)_{p,q}$ have the standard form of Problem 2.
Then, clearly, $dP \wedge dQ = dp \wedge dq$, i.e., the map A is canonical.

PROBLEM 4. Show that the Poisson bracket of a product can be calculated by Leibniz’s rule:

$$(F_1 F_2, H) = F_1 (F_2, H) + F_2 (F_1, H).$$

Hint. The Poisson bracket $(F_1 F_2, H)$ is the derivative of the product $F_1 F_2$ in the direction
of the field I dH.
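
The coordinate formula of Problem 1 is easy to experiment with. The following sketch is an
illustration only: the function name `poisson`, the central-difference approximation of the
partial derivatives, and the sample functions and point are all assumptions made here. It
evaluates the bracket in $\mathbb{R}^4$ with the sign convention of the text and reproduces the
brackets of Problem 2.

```python
def poisson(F, H, p, q, h=1e-5):
    """(F, H) = sum_i (dH/dp_i * dF/dq_i - dH/dq_i * dF/dp_i) at the point (p, q).

    Partial derivatives are approximated by central differences
    (an assumption of this sketch, not a method taken from the text).
    """
    def partial(G, block, i):
        plus = [list(p), list(q)]
        minus = [list(p), list(q)]
        plus[block][i] += h      # block 0 selects p, block 1 selects q
        minus[block][i] -= h
        return (G(*plus) - G(*minus)) / (2 * h)

    return sum(partial(H, 0, i) * partial(F, 1, i)
               - partial(H, 1, i) * partial(F, 0, i)
               for i in range(len(p)))


def p1(p, q): return p[0]
def p2(p, q): return p[1]
def q1(p, q): return q[0]


def energy(p, q):
    """Hamiltonian of two uncoupled unit oscillators (sample choice)."""
    return 0.5 * (p[0] ** 2 + p[1] ** 2 + q[0] ** 2 + q[1] ** 2)


p0, q0 = [0.3, -1.2], [0.7, 0.4]
print(poisson(q1, p1, p0, q0))       # close to  1
print(poisson(p1, q1, p0, q0))       # close to -1
print(poisson(q1, p2, p0, q0))       # close to  0
print(poisson(q1, energy, p0, q0))   # close to p_1, i.e. Hamilton's equation for q_1
```

Since (F, H) is the derivative of F along the flow of H, the last line is the right-hand side of
Hamilton's equation $\dot q_1 = \partial H / \partial p_1$.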


B The Jacobi identity 


Theorem. The Poisson bracket of three functions A, B, and C satisfies the 
Jacobi identity: 


$$((A, B), C) + ((B, C), A) + ((C, A), B) = 0.$$

Corollary (Poisson’s theorem). The Poisson bracket of two first integrals
$F_1$, $F_2$ of a system with hamiltonian function H is again a first integral.

PROOF OF THE COROLLARY. By the Jacobi identity,

$$((F_1, F_2), H) = (F_1, (F_2, H)) + (F_2, (H, F_1)) = 0 + 0,$$

as was to be shown. □

In this way, by knowing two first integrals we can find a third, fourth, etc.
by a simple computation. Of course, not all the integrals we get will be
essentially new, since there cannot be more than 2n independent functions
on $M^{2n}$. Sometimes we may get functions of old integrals or constants,
which may be zero. But sometimes we do obtain new integrals.


PROBLEM. Calculate the Poisson brackets of the components $p_1, p_2, p_3, M_1, M_2, M_3$ of the
linear and angular momentum vectors of a mechanical system.


ANSWER. $(M_1, M_2) = M_3$, $(M_1, p_1) = 0$, $(M_1, p_2) = p_3$, $(M_1, p_3) = -p_2$. This implies


Theorem. If two components, $M_1$ and $M_2$, of the angular momentum of some mechanical problem
are conserved, then the third component is also conserved.
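
The brackets in the answer can be checked numerically for a single particle, taking M = q × p.
This is a sketch under assumptions made here: the helper names `poisson`, `M` and `P`, the
sample point, and the central-difference derivatives are illustrative; the bracket uses the sign
convention (F, H) = Σ (∂H/∂p_i ∂F/∂q_i − ∂H/∂q_i ∂F/∂p_i) of this section.

```python
def poisson(F, H, p, q, h=1e-5):
    """Poisson bracket (F, H) in canonical coordinates, by central differences."""
    def partial(G, block, i):
        plus = [list(p), list(q)]
        minus = [list(p), list(q)]
        plus[block][i] += h      # block 0 selects p, block 1 selects q
        minus[block][i] -= h
        return (G(*plus) - G(*minus)) / (2 * h)

    return sum(partial(H, 0, i) * partial(F, 1, i)
               - partial(H, 1, i) * partial(F, 0, i)
               for i in range(len(p)))


def M(k):
    """Component number k (0, 1, 2) of the angular momentum q x p."""
    i, j = (k + 1) % 3, (k + 2) % 3
    return lambda p, q: q[i] * p[j] - q[j] * p[i]


def P(k):
    """Component number k (0, 1, 2) of the linear momentum."""
    return lambda p, q: p[k]


p0, q0 = [0.5, -1.0, 2.0], [1.5, 0.25, -0.75]   # sample phase point
m1_m2 = poisson(M(0), M(1), p0, q0)
m1_p1 = poisson(M(0), P(0), p0, q0)
m1_p2 = poisson(M(0), P(1), p0, q0)
m1_p3 = poisson(M(0), P(2), p0, q0)
print(m1_m2, M(2)(p0, q0))   # (M_1, M_2) against M_3
print(m1_p1, m1_p2, m1_p3)   # against 0, p_3, -p_2
```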


PROOF OF THE JACOBI IDENTITY. Consider the sum

$$((A, B), C) + ((B, C), A) + ((C, A), B).$$

This sum is a “linear combination of second partial derivatives” of the
functions A, B, and C. We will compute the terms in the second derivatives
of A:

$$((A, B), C) + ((C, A), B) = (L_{\mathbf{C}} L_{\mathbf{B}} - L_{\mathbf{B}} L_{\mathbf{C}}) A,$$

where $L_\xi$ is differentiation in the direction of ξ and $\mathbf{F}$ is the hamiltonian
field with hamiltonian function F.

But, by Lemma 2, Section 39, the commutator of the differentiations
$L_{\mathbf{C}} L_{\mathbf{B}} - L_{\mathbf{B}} L_{\mathbf{C}}$ is a first-order differential operator.
This means that none of the second derivatives of A are contained in our sum. The same thing is
true for the second derivatives of B and C. Therefore, the sum is zero. □


Corollary 5. Let $\mathbf{B}$ and $\mathbf{C}$ be hamiltonian fields with hamiltonian functions
B and C. Consider the Poisson bracket $[\mathbf{B}, \mathbf{C}]$ of the vector fields. This
vector field is hamiltonian, and its hamiltonian function is equal to the
Poisson bracket of the hamiltonian functions (B, C).


PROOF. Set (B, C) = D. The Jacobi identity can be rewritten in the form

$$(A, D) = ((A, B), C) - ((A, C), B),$$

$$L_{\mathbf{D}} = L_{\mathbf{C}} L_{\mathbf{B}} - L_{\mathbf{B}} L_{\mathbf{C}}, \qquad L_{\mathbf{D}} = L_{[\mathbf{B}, \mathbf{C}]},$$

as was to be shown. □

C The Lie algebras of hamiltonian fields, 
hamiltonian functions, and first integrals 


A linear subspace of a Lie algebra is called a subalgebra if the commutator 
of any two elements of the subspace belongs to it. A subalgebra of a Lie 
algebra is itself a Lie algebra. The preceding corollary implies, in particular, 


Corollary 6. The hamiltonian vector fields on a symplectic manifold form a 
subalgebra of the Lie algebra of all vector fields. 


Poisson’s theorem on first integrals can be re-formulated as 


Corollary 7. The first integrals of a hamiltonian phase flow form a subalgebra 
of the Lie algebra of all functions. 


The Lie algebra of hamiltonian functions can be mapped naturally onto
the Lie algebra of hamiltonian vector fields. To do this, to every function H
we associate the hamiltonian vector field $\mathbf{H}$ with hamiltonian function H.


Corollary 8. The map of the Lie algebra of functions onto the Lie algebra of
hamiltonian fields is an algebra homomorphism. Its kernel consists of the
locally constant functions. If $M^{2n}$ is connected, the kernel is one-dimensional
and consists of constants.

PROOF. Our map is linear. Corollary 5 says that our map carries the Poisson
bracket of functions into the Poisson bracket of vector fields. The kernel
consists of functions H for which I dH = 0. Since I is an isomorphism,
dH = 0 and H = const. □


Corollary 9. The phase flows with hamiltonian functions $H_1$ and $H_2$ commute
if and only if the Poisson bracket of the functions $H_1$ and $H_2$ is (locally)
constant.


PROOF. By the theorem in Section 39, E, it is necessary and sufficient that
$[\mathbf{H}_1, \mathbf{H}_2] = 0$, and by Corollary 8 this condition is equivalent to
$d(H_1, H_2) = 0$. □


We obtain yet another generalization of E. Noether’s theorem: given a 
flow which commutes with the one under consideration, one can construct 
a first integral. 


D Locally hamiltonian vector fields 


Let $(M^{2n}, \omega^2)$ be a symplectic manifold and $g^t: M^{2n} \to M^{2n}$ a one-parameter group of
diffeomorphisms preserving the symplectic structure. Will $g^t$ be a hamiltonian flow?


EXAMPLE. Let $M^{2n}$ be a two-dimensional torus $T^2$, a point of which is given by a pair of
coordinates (p, q) mod 1. Let $\omega^2$ be the usual area element $dp \wedge dq$. Consider the family of
translations $g^t(p, q) = (p + t, q)$ (Figure 173). The maps $g^t$ preserve the symplectic structure (i.e.,
area). Can we find a hamiltonian function corresponding to the vector field ($\dot p = 1$, $\dot q = 0$)?
If $\dot p = -\partial H/\partial q$ and $\dot q = \partial H/\partial p$, we would have
$\partial H/\partial p = 0$ and $\partial H/\partial q = -1$, i.e., H = −q + C.
But q is only a local coordinate on $T^2$; there is no map $H: T^2 \to \mathbb{R}$ for which
$\partial H/\partial p = 0$ and $\partial H/\partial q = -1$. Thus $g^t$ is not a hamiltonian phase flow.


Figure 173 A locally hamiltonian field on the torus


Definition. A locally hamiltonian vector field on a symplectic manifold $(M^{2n}, \omega^2)$ is the vector
field $I\omega^1$, where $\omega^1$ is a closed 1-form on $M^{2n}$.


Locally, a closed 1-form is the differential of a function, $\omega^1 = dH$. However, in attempting
to extend the function H to the whole manifold $M^{2n}$ we may obtain a “many-valued hamiltonian
function,” since a closed 1-form on a non-simply-connected manifold may not be a differential
(for example, the form dq on $T^2$). A phase flow given by a locally hamiltonian vector field is called
a locally hamiltonian flow.


PROBLEM. Show that a one-parameter group of diffeomorphisms of a symplectic manifold preserves
the symplectic structure if and only if it is a locally hamiltonian phase flow.
Hint. Cf. Section 38A.

PROBLEM. Show that in the symplectic space $\mathbb{R}^{2n}$, every one-parameter group of canonical
diffeomorphisms (preserving $dp \wedge dq$) is a hamiltonian flow.
Hint. Every closed 1-form on $\mathbb{R}^{2n}$ is the differential of a function.


PROBLEM. Show that the locally hamiltonian vector fields form a sub-algebra of the Lie algebra
of all vector fields. In addition, the Poisson bracket of two locally hamiltonian fields is actually
a hamiltonian field, with a hamiltonian function uniquely⁶⁵ determined by the given fields ξ
and η by the formula $H = \omega^2(\xi, \eta)$. Thus, the hamiltonian fields form an ideal in the Lie
algebra of locally hamiltonian fields.


41 Symplectic geometry 


A euclidean structure on a vector space is given by a symmetric bilinear form, and a symplectic
structure by a skew-symmetric one. The geometry of a symplectic space is different from that of
a euclidean space, although there are many similarities.


A Symplectic vector spaces 


Let $\mathbb{R}^{2n}$ be an even-dimensional vector space.


Definition. A symplectic linear structure on $\mathbb{R}^{2n}$ is a nondegenerate⁶⁶ bilinear
skew-symmetric 2-form given in $\mathbb{R}^{2n}$. This form is called the
skew-scalar product and is denoted by $[\xi, \eta] = -[\eta, \xi]$. The space $\mathbb{R}^{2n}$,
together with the symplectic structure [ , ], is called a symplectic vector
space.


EXAMPLE. Let $(p_1, \ldots, p_n; q_1, \ldots, q_n)$ be coordinate functions on $\mathbb{R}^{2n}$, and
$\omega^2$ the form

$$\omega^2 = p_1 \wedge q_1 + \cdots + p_n \wedge q_n.$$

Since this form is nondegenerate and skew-symmetric, it can be taken for a
skew-scalar product: $[\xi, \eta] = \omega^2(\xi, \eta)$. In this way the coordinate space
$\mathbb{R}^{2n} = \{(p, q)\}$ receives a symplectic structure. This structure is called the
standard symplectic structure. In the standard symplectic structure the
skew-scalar product of two vectors ξ and η is equal to the sum of the oriented
areas of the projections of the parallelogram (ξ, η) onto the n coordinate planes $(p_i, q_i)$.
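
As a small illustration of the standard structure, the skew-scalar product can be computed as the
sum of n two-by-two determinants, one for each coordinate plane $(p_i, q_i)$. The function name
`skew` and the ordering of coordinates as $(p_1, \ldots, p_n, q_1, \ldots, q_n)$ are assumptions of
this sketch.

```python
def skew(xi, eta):
    """[xi, eta] = sum_i (p_i(xi) q_i(eta) - q_i(xi) p_i(eta)).

    Vectors are given as (p_1, ..., p_n, q_1, ..., q_n); each summand is the
    oriented area of the projection of the parallelogram (xi, eta) onto the
    coordinate plane (p_i, q_i).
    """
    n = len(xi) // 2
    return sum(xi[i] * eta[n + i] - xi[n + i] * eta[i] for i in range(n))


xi = (1, 2, 3, 4)
eta = (5, 6, 7, 8)
e_p1, e_q1, e_p2 = (1, 0, 0, 0), (0, 0, 1, 0), (0, 1, 0, 0)

print(skew(xi, eta))     # (1*7 - 3*5) + (2*8 - 4*6)
print(skew(eta, xi))     # opposite sign
print(skew(xi, xi))      # every vector is skew-orthogonal to itself
print(skew(e_p1, e_q1))  # conjugate basis vectors
print(skew(e_p1, e_p2))  # basis vectors from different planes
```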


Two vectors ξ and η in a symplectic space are called skew-orthogonal
($\xi \angle \eta$) if their skew-scalar product is equal to zero.


PROBLEM. Show that $\xi \angle \xi$: every vector is skew-orthogonal to itself.


The set of all vectors skew-orthogonal to a given vector η is called the
skew-orthogonal complement to η.


⁶⁵ Not just up to a constant.
⁶⁶ A 2-form [ , ] on $\mathbb{R}^{2n}$ is nondegenerate if $([\xi, \eta] = 0, \ \forall \eta) \Rightarrow (\xi = 0)$.

PROBLEM. Show that the skew-orthogonal complement to η is a (2n − 1)-dimensional hyperplane
containing η.
Hint. If all vectors were skew-orthogonal to η, then the form [ , ] would be degenerate.


B The symplectic basis 


A euclidean structure under a suitable choice of basis (it must be ortho- 
normal) is given by a scalar product in a particular standard form. In exactly 
the same way, a symplectic structure takes the standard form indicated 
above in a suitable basis. 


PROBLEM. Find the skew-scalar product of the basis vectors $e_{p_i}$ and $e_{q_i}$ (i = 1, ..., n) in the
example presented above.
Solution. The relations

$$[e_{p_i}, e_{p_j}] = [e_{q_i}, e_{q_j}] = 0, \qquad [e_{p_i}, e_{q_j}] = 0 \ (i \ne j), \qquad [e_{p_i}, e_{q_i}] = 1 \tag{1}$$

follow from the definition of $p_1 \wedge q_1 + \cdots + p_n \wedge q_n$.


We now return to the general symplectic space.


Definition. A symplectic basis is a set of 2n vectors $e_{p_i}$, $e_{q_i}$ (i = 1, ..., n)
whose skew-scalar products have the form (1).


In other words, every basis vector is skew-orthogonal to all the basis
vectors except one, associated to it; its product with the associated vector
is equal to ±1.


Theorem. Every symplectic space has a symplectic basis. Furthermore, we can 
take any nonzero vector e for the first basis vector. 


PROOF. This theorem is entirely analogous to the corresponding theorem in
euclidean geometry and is proved in almost the same way.

Since the vector e is not zero, there is a vector f not skew-orthogonal to it
(the form [ , ] is nondegenerate). By choosing the length of this vector, we
can ensure that its skew-scalar product with e is equal to 1. In the case n = 1,
the theorem is proved.

If n > 1, consider the skew-orthogonal complement D (Figure 174) to
the pair of vectors e, f. D is the intersection of the skew-orthogonal complements
to e and f. These two (2n − 1)-dimensional spaces do not coincide,
since e is not in the skew-orthogonal complement to f. Therefore, their intersection
has even dimension 2n − 2.

Figure 174 Skew-orthogonal complement

We will show that D is a symplectic subspace of $\mathbb{R}^{2n}$, i.e., that the skew-scalar
product [ , ] restricted to D is nondegenerate. If a vector $\xi \in D$
were skew-orthogonal to the whole subspace D, then since it would also be
skew-orthogonal to e and to f, ξ would be skew-orthogonal to $\mathbb{R}^{2n}$, which
contradicts the nondegeneracy of [ , ] on $\mathbb{R}^{2n}$. Thus $D^{2n-2}$ is symplectic.

Now if we adjoin the vectors e and f to a symplectic basis for $D^{2n-2}$ we
get a symplectic basis for $\mathbb{R}^{2n}$, and the theorem is proved by induction on n. □
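
The inductive proof is constructive and can be followed step by step on a computer. The sketch
below is one possible rendering under assumptions made here: the skew-scalar product is given by
a nondegenerate skew-symmetric integer matrix `omega` (so that [u, v] = u′ω v), exact rational
arithmetic is used, and the names `skew` and `symplectic_basis` are hypothetical. At each step a
pair e, f with [e, f] = 1 is chosen, and the remaining vectors are replaced by
v − [v, f] e + [v, e] f, which lies in the skew-orthogonal complement of e and f.

```python
from fractions import Fraction


def skew(omega, u, v):
    """Skew-scalar product [u, v] = u^T omega v."""
    n = len(u)
    return sum(u[i] * omega[i][j] * v[j] for i in range(n) for j in range(n))


def symplectic_basis(omega):
    """Pairs (e_k, f_k) with [e_k, f_k] = 1 and all other products zero.

    Assumes omega is skew-symmetric and nondegenerate; otherwise the
    search for f below fails.
    """
    dim = len(omega)
    pool = [[Fraction(int(i == j)) for j in range(dim)] for i in range(dim)]
    pairs = []
    while pool:
        e = pool.pop(0)
        # nondegeneracy on the current subspace guarantees such an f exists
        k = next(i for i, v in enumerate(pool) if skew(omega, e, v) != 0)
        f = pool.pop(k)
        c = skew(omega, e, f)
        f = [x / c for x in f]               # normalize so that [e, f] = 1
        pairs.append((e, f))
        # project the remaining vectors into the skew-orthogonal complement of e, f
        pool = [[v[i] - skew(omega, v, f) * e[i] + skew(omega, v, e) * f[i]
                 for i in range(dim)] for v in pool]
    return pairs


# a sample nondegenerate skew-symmetric form on R^4 (not in standard form)
omega = [[0, 1, 2, 0],
         [-1, 0, 1, 3],
         [-2, -1, 0, 1],
         [0, -3, -1, 0]]
pairs = symplectic_basis(omega)
for e, f in pairs:
    print(e, f)
```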


Corollary. All symplectic spaces of the same dimension are isomorphic. 


If we take the vectors of a symplectic basis as coordinate unit vectors,
we obtain a coordinate system $p_i$, $q_i$ in which [ , ] takes the standard
form $p_1 \wedge q_1 + \cdots + p_n \wedge q_n$. Such a coordinate system is called
symplectic.


C The symplectic group 


To a euclidean structure we associated the orthogonal group of linear map- 
pings which preserved the euclidean structure. In a symplectic space the 
symplectic group plays an analogous role. 


Definition. A linear transformation $S: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ of the symplectic space
$\mathbb{R}^{2n}$ to itself is called symplectic if it preserves the skew-scalar product:

$$[S\xi, S\eta] = [\xi, \eta], \quad \forall \xi, \eta \in \mathbb{R}^{2n}.$$


The set of all symplectic transformations of R?" is called the symplectic 
group and is denoted by Sp(2n). 


It is clear that the composition of two symplectic transformations is 
symplectic. To justify the term symplectic group, we must only show that a 
symplectic transformation is nonsingular; it is then clear that the inverse is 
also symplectic. 


PROBLEM. Show that the group Sp(2) is isomorphic to the group of real two-by-two matrices
with determinant 1 and is homeomorphic to the interior of a solid three-dimensional torus.


Theorem. A transformation $S: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ of the standard symplectic space
(p, q) is symplectic if and only if it is linear and canonical, i.e., preserves the
differential 2-form

$$\omega^2 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n.$$

PROOF. Under the natural identification of the tangent space to $\mathbb{R}^{2n}$ with
$\mathbb{R}^{2n}$, the 2-form $\omega^2$ goes to [ , ]. □

Corollary. The determinant of any symplectic transformation is equal to 1.


PROOF. We already know (Section 38B) that canonical maps preserve the
exterior powers of the form $\omega^2$. But its n-th exterior power is (up to a constant
multiple) the volume element on $\mathbb{R}^{2n}$. This means that symplectic transformations
S of the standard $\mathbb{R}^{2n} = \{(p, q)\}$ preserve the volume element,
so det S = 1. But since every symplectic linear structure can be written down
in standard form in a symplectic coordinate system, the determinant of a
symplectic transformation of any symplectic space is equal to 1. □


Theorem. A linear transformation $S: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is symplectic if and only if it
takes some (and therefore any) symplectic basis into a symplectic basis.


PROOF. The skew-scalar product of any two linear combinations of basis vectors
can be expressed in terms of skew-scalar products of basis vectors. If the
transformation does not change the skew-scalar products of basis vectors,
then it does not change the skew-scalar products of any vectors. □


D Planes in symplectic space 


In a euclidean space all planes are equivalent: each of them can be carried into 
any other one by a motion. We will now look at a symplectic vector space 
from this point of view. 


PROBLEM. Show that a nonzero vector in a symplectic space can be carried into any other
nonzero vector by a symplectic transformation.


PROBLEM. Show that not every two-dimensional plane of the symplectic space $\mathbb{R}^{2n}$ can be
obtained from a given 2-plane by a symplectic transformation.
Hint. Consider the planes $(p_1, p_2)$ and $(p_1, q_1)$.


Definition. A k-dimensional plane (i.e., subspace) of a symplectic space is
called null⁶⁷ if it is skew-orthogonal to itself, i.e., if the skew-scalar product
of any two vectors of the plane is equal to zero.


EXAMPLE. The coordinate plane $(p_1, \ldots, p_n)$ in the symplectic coordinate system p, q is null.
(Prove it!)


PROBLEM. Show that any non-null two-dimensional plane can be carried into any other non- 
null two-plane by a symplectic transformation. 


For calculations in symplectic geometry it may be useful to impose some 
euclidean structure on the symplectic space. We fix a symplectic coordinate 
system p, q and introduce a euclidean structure using the coordinate scalar 
product 


$$(x, x) = \sum p_i^2 + q_i^2, \quad \text{where } x = \sum p_i e_{p_i} + q_i e_{q_i}.$$


⁶⁷ Null planes are also called isotropic, and for k = n, lagrangian.




The symplectic basis $e_p$, $e_q$ is orthonormal in this euclidean structure. The
skew-scalar product, like every bilinear form, can be expressed in terms of
the scalar product by

$$[\xi, \eta] = (I\xi, \eta) \tag{2}$$

where $I: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ is some operator. It follows from the skew-symmetry of
the skew-scalar product that the operator I is skew-symmetric.


PROBLEM. Compute the matrix of the operator I in the symplectic basis $e_{p_i}$, $e_{q_i}$.


ANSWER.

$$\begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix},$$

where E is the n × n identity matrix.


Thus, for n = 1 (in the p, q-plane), I is simply rotation by 90°, and in the
general case I is rotation by 90° in each of the n planes $p_i$, $q_i$.


PROBLEM. Show that the operator I is symplectic and that $I^2 = -E_{2n}$.


Although the euclidean structures and the operator I are not invariantly
associated to a symplectic space, they are often convenient.
The following theorem follows directly from (2).


Theorem. A plane π of a symplectic space is null if and only if the plane Iπ is
orthogonal to π.


Notice that the dimensions of the planes π and Iπ are the same, since I is
nonsingular. Hence


Corollary. The dimension of a null plane in $\mathbb{R}^{2n}$ is less than or equal to n.


This follows since the two k-dimensional planes π and Iπ cannot be
orthogonal if k > n.

We consider more carefully the n-dimensional null planes in the symplectic
coordinate space $\mathbb{R}^{2n}$. An example of such a plane is the coordinate p-plane.
There are in all $\binom{2n}{n}$ n-dimensional coordinate planes in $\mathbb{R}^{2n} = \{(p, q)\}$.


PROBLEM. Show that there are $2^n$ null planes among the $\binom{2n}{n}$ n-dimensional coordinate planes:
to each of the $2^n$ partitions of the set (1, ..., n) into two parts $(i_1, \ldots, i_k)$,
$(j_1, \ldots, j_{n-k})$ we associate the null coordinate plane
$p_{i_1}, \ldots, p_{i_k}, q_{j_1}, \ldots, q_{j_{n-k}}$.
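
The count can be confirmed by brute force for small n. In this sketch (the function names and the
indexing are assumptions made here) the coordinate axes are numbered 0, ..., 2n − 1, with
indices 0, ..., n − 1 standing for $p_1, \ldots, p_n$ and n, ..., 2n − 1 for $q_1, \ldots, q_n$. By
relations (1), a coordinate plane is null exactly when it does not contain both $p_i$ and $q_i$
for any i.

```python
from itertools import combinations
from math import comb


def is_null(axes, n):
    """True if the coordinate plane spanned by the given axes contains no conjugate pair."""
    chosen = set(axes)
    return not any(i in chosen and n + i in chosen for i in range(n))


def count_planes(n):
    """Return (number of n-dimensional coordinate planes, number of null ones)."""
    planes = list(combinations(range(2 * n), n))
    return len(planes), sum(is_null(axes, n) for axes in planes)


for n in (1, 2, 3, 4):
    total, null = count_planes(n)
    print(n, total, comb(2 * n, n), null, 2 ** n)
```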


In order to study the generating functions of canonical transformations
we need

Theorem. Every n-dimensional null plane π in the symplectic coordinate space
$\mathbb{R}^{2n}$ is transverse⁶⁸ to at least one of the $2^n$ coordinate null planes.

Figure 175 Construction of a coordinate plane σ transversal to a given plane π.


PROOF. Let P be the null plane $(p_1, \ldots, p_n)$ (Figure 175). Consider the intersection
$\tau = \pi \cap P$. Suppose that the dimension of τ is equal to k, $0 \le k \le n$.
Like every k-dimensional subspace of the n-dimensional space, the plane τ is
transverse to at least one (n − k)-dimensional coordinate plane in P, let us
say the plane

$$\eta = (p_{i_1}, \ldots, p_{i_{n-k}}); \qquad \tau + \eta = P, \quad \tau \cap \eta = 0.$$

We now consider the null n-dimensional coordinate plane

$$\sigma = (p_{i_1}, \ldots, p_{i_{n-k}}, q_{j_1}, \ldots, q_{j_k}), \qquad \eta = \sigma \cap P,$$

and show that our plane π is transverse to σ:

$$\pi \cap \sigma = 0.$$

We have

$$\left. \begin{array}{l} \tau \subset \pi, \ \pi \angle \pi \Rightarrow \tau \angle \pi \\ \eta \subset \sigma, \ \sigma \angle \sigma \Rightarrow \eta \angle \sigma \end{array} \right\} \Rightarrow (\tau + \eta) = P \angle (\pi \cap \sigma).$$

But P is an n-dimensional null plane. Therefore, every vector skew-orthogonal
to P belongs to P (cf. the corollary above). Thus $(\pi \cap \sigma) \subset P$. Finally,

$$\pi \cap \sigma = (\pi \cap P) \cap (\sigma \cap P) = \tau \cap \eta = 0,$$

as was to be shown. □


PROBLEM. Let $\pi_1$ and $\pi_2$ be two k-dimensional planes in symplectic $\mathbb{R}^{2n}$. Is it always
possible to carry $\pi_1$ to $\pi_2$ by a symplectic transformation? How many classes of planes are there
which cannot be carried one into another?


ANSWER. [k/2] + 1 if k ≤ n; [(2n − k)/2] + 1 if k ≥ n.


E Symplectic structure and complex structure 


Since $I^2 = -E$ we can introduce into our space $\mathbb{R}^{2n}$ not only a symplectic
structure [ , ] and euclidean structure ( , ), but also a complex structure,
by defining multiplication by $i = \sqrt{-1}$ to be the action of I. The space $\mathbb{R}^{2n}$
is identified in this way with a complex space $\mathbb{C}^n$ (the coordinate space with
coordinates $z_k = p_k + iq_k$). The linear transformations of $\mathbb{R}^{2n}$ which preserve
the euclidean structure form the orthogonal group O(2n); those preserving
the complex structure form the complex linear group GL(n, C).


⁶⁸ Two subspaces $L_1$ and $L_2$ of a vector space L are transverse if $L_1 + L_2 = L$. Two
n-dimensional planes in $\mathbb{R}^{2n}$ are transverse if and only if they intersect only in 0.


PROBLEM. Show that transformations which are both orthogonal and symplectic are complex, 
that those which are both complex and orthogonal are symplectic, and that those which are 
both symplectic and complex are orthogonal: thus that the intersection of two of the three 
groups is equal to the intersection of all three: 


$$O(2n) \cap Sp(2n) = Sp(2n) \cap GL(n, \mathbb{C}) = GL(n, \mathbb{C}) \cap O(2n).$$


This intersection is called the unitary group U(n). 


Unitary transformations preserve the hermitian scalar product
$(\xi, \eta) + i[\xi, \eta]$; the scalar and skew-scalar products on $\mathbb{R}^{2n}$ are its real
and imaginary parts.


42 Parametric resonance in systems with many degrees 
of freedom 


During our investigation of oscillating systems with periodically varying parameters (cf. Section 
25), we explained that parametric resonance depends on the behavior of the eigenvalues of a 
certain linear transformation (“the mapping at a period”). The dependence consists of the fact 
that an equilibrium position of a system with periodically varying parameters is stable if the 
eigenvalues of the mapping at a period have modulus less than 1, and unstable if at least one of 
the eigenvalues has modulus greater than 1. 

The mapping at a period obtained from a system of Hamilton’s equations with periodic 
coefficients is symplectic. The investigation in Section 25 of parametric resonance in a system 
with one degree of freedom relied on our analysis of the behavior of the eigenvalues of symplectic 
transformations of the plane. In this paragraph we will analyze, in an analogous way, the behavior 
of the eigenvalues of symplectic transformations in a phase space of any dimension. The results 
of this analysis (due to M. G. Krein) can be applied to the study of conditions for the appearance 
of parametric resonance in mechanical systems with many degrees of freedom. 


A Symplectic matrices 


Consider a linear transformation of a symplectic space, $S: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$. Let
$p_1, \ldots, p_n; q_1, \ldots, q_n$ be a symplectic coordinate system. In this coordinate
system, the transformation is given by a matrix S.


Theorem. A transformation is symplectic if and only if its matrix S in the symplectic
coordinate system (p, q) satisfies the relation

$$S' I S = I,$$

where

$$I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}$$

and S′ is the transpose of S.

PROOF. The condition for being symplectic ($[S\xi, S\eta] = [\xi, \eta]$ for all ξ and η)
can be written in terms of the scalar product by using the operator I, as
follows:

$$(IS\xi, S\eta) = (I\xi, \eta), \quad \forall \xi, \eta$$

or

$$(S'IS\xi, \eta) = (I\xi, \eta), \quad \forall \xi, \eta,$$

as was to be shown. □
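
The criterion S′IS = I is convenient for machine checking. The following sketch uses integer
matrices stored as lists of rows; the helper names (`matmul`, `transpose`, `standard_I`,
`is_symplectic`) and the sample matrices are assumptions made here for illustration, with I taken
as the matrix of formula (2), Section 41, in the basis (p, q).

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def standard_I(n):
    """The 2n x 2n block matrix with rows (0, -E) and (E, 0)."""
    m = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        m[i][n + i] = -1
        m[n + i][i] = 1
    return m


def is_symplectic(s):
    """Check the relation S' I S = I."""
    big_i = standard_I(len(s) // 2)
    return matmul(matmul(transpose(s), big_i), s) == big_i


shear = [[1, 1], [0, 1]]                 # determinant 1
stretch = [[2, 0], [0, 1]]               # determinant 2
good = [[1, 0, 1, 2],                    # block form (E, B; 0, E) with symmetric B
        [0, 1, 2, 3],
        [0, 0, 1, 0],
        [0, 0, 0, 1]]
bad = [[1, 0, 1, 2],                     # the same with a non-symmetric B
       [0, 1, 0, 3],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]
print(is_symplectic(shear), is_symplectic(stretch),
      is_symplectic(good), is_symplectic(bad))
```

The last pair of examples shows that determinant 1 is necessary but, for n > 1, not sufficient.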
B Symmetry of the spectrum of a symplectic
transformation

Theorem. The characteristic polynomial of a symplectic transformation

$$p(\lambda) = \det(S - \lambda E)$$

is reflexive,⁶⁹ i.e., $p(\lambda) = \lambda^{2n} p(1/\lambda)$.


PROOF. We will use the facts that det S = det I = 1, $I^2 = -E$, and det A′ = det A.
By the theorem above, $S = -I (S')^{-1} I$. Therefore,

$$p(\lambda) = \det(S - \lambda E) = \det(-I (S')^{-1} I - \lambda E) = \det(-(S')^{-1} + \lambda E) = \det(-E + \lambda S) = \lambda^{2n} \det\left(S - \frac{1}{\lambda} E\right) = \lambda^{2n} p\left(\frac{1}{\lambda}\right). \quad \square$$


Corollary. If λ is an eigenvalue of a symplectic transformation, then 1/λ is also
an eigenvalue.


On the other hand, the characteristic polynomial is real; therefore, if λ
is a complex eigenvalue, then $\bar\lambda$ is an eigenvalue different from λ. It follows
that the roots λ of the characteristic polynomial lie symmetrically with
respect to the real axis and to the unit circle (Figure 176). They come in
4-tuples,

$$\lambda, \ \bar\lambda, \ \frac{1}{\lambda}, \ \frac{1}{\bar\lambda} \qquad (|\lambda| \ne 1, \ \operatorname{Im} \lambda \ne 0),$$

and pairs lying on the real axis,

$$\lambda = \bar\lambda, \quad \frac{1}{\lambda} = \frac{1}{\bar\lambda},$$

or on the unit circle,

$$\lambda = \frac{1}{\bar\lambda}, \quad \bar\lambda = \frac{1}{\lambda}.$$

It is not hard to verify that the multiplicities of all four points of a 4-tuple (or
both points of a pair) are the same.

Figure 176 Distribution of the eigenvalues of a symplectic transformation

⁶⁹ A reflexive polynomial is a polynomial $a_0 x^m + a_1 x^{m-1} + \cdots + a_m$ which has symmetric
coefficients $a_0 = a_m$, $a_1 = a_{m-1}$, ...
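
The symmetry of the coefficients can be observed on an explicit matrix. The sketch below makes
several assumptions of its own: the sample matrix is built as a product of two block matrices
(E, B; 0, E) and (E, 0; C, E) with symmetric B and C, each of which satisfies S′IS = I; the
characteristic polynomial is computed by the Faddeev-LeVerrier recursion in exact rational
arithmetic; and the helper names are hypothetical. Since the dimension is even,
det(λE − S) = det(S − λE).

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def char_poly(a):
    """Coefficients [c_n, ..., c_0] of det(lambda*E - a) (Faddeev-LeVerrier recursion)."""
    n = len(a)
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        am = matmul(a, m)
        # M_k = A M_{k-1} + c_{n-k+1} E
        m = [[am[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        t = matmul(a, m)
        # c_{n-k} = -tr(A M_k) / k
        coeffs.append(-sum(t[i][i] for i in range(n)) / k)
    return coeffs


upper = [[1, 0, 1, 2],      # (E, B; 0, E) with symmetric B
         [0, 1, 2, 3],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
lower = [[1, 0, 0, 0],      # (E, 0; C, E) with symmetric C
         [0, 1, 0, 0],
         [2, 1, 1, 0],
         [1, -1, 0, 1]]
S = matmul(upper, lower)    # a sample symplectic matrix
coeffs = char_poly(S)
print(coeffs)
```

The printed list reads the same forwards and backwards, and its last entry, det S, equals 1.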


C Stability 


Definition. A transformation S is called stable if

$$\forall \varepsilon > 0, \ \exists \delta > 0: \ |x| < \delta \Rightarrow |S^N x| < \varepsilon, \quad \forall N > 0.$$


PROBLEM. Show that if at least one of the eigenvalues of a symplectic transformation S does not 
lie on the unit circle, then S is unstable. 

Hint. In view of the demonstrated symmetry, if one of the eigenvalues does not lie on the
unit circle, then there exists an eigenvalue outside the unit circle, $|\lambda| > 1$; in the
corresponding invariant subspace, S is an “expansion with a rotation.”


PROBLEM. Show that if all the eigenvalues of a linear transformation are distinct and lie on the 
unit circle, then the transformation is stable. 
Hint. Change to a basis of eigenvectors. 
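
The two problems can be illustrated on 2 × 2 matrices of determinant 1, which are symplectic for n = 1. The following is a small sketch with two integer matrices chosen for illustration (they do not appear in the text): one with eigenvalues on the unit circle and one with real eigenvalues off it. The helper `power_norms` is a hypothetical name.

```python
def matmul(a, b):
    """Multiply two 2 x 2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_norms(s, steps):
    """Largest absolute entry of S, S^2, ..., S^steps."""
    norms, power = [], s
    for _ in range(steps):
        norms.append(max(abs(x) for row in power for x in row))
        power = matmul(power, s)
    return norms

elliptic = [[1, -1], [1, 0]]    # trace 1, det 1: eigenvalues exp(+-i*pi/3)
hyperbolic = [[2, 1], [1, 1]]   # trace 3, det 1: eigenvalues (3 +- sqrt(5))/2

print(max(power_norms(elliptic, 60)))
print(power_norms(hyperbolic, 10)[-1])
```

The powers of the first matrix repeat with period 6 and stay bounded, while the entries of the powers of the second grow without bound.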


Definition. A symplectic transformation $S$ is called strongly stable if every
symplectic transformation sufficiently close⁷⁰ to $S$ is stable.


In Section 25 we established that $S: R^2 \to R^2$ is strongly stable if
$\lambda_{1,2} = e^{\pm i\alpha}$ and $\lambda_1 \ne \lambda_2$.


Theorem. If all 2n eigenvalues of a symplectic transformation S are distinct 
and lie on the unit circle, then S is strongly stable. 


PROOF. We enclose the $2n$ eigenvalues $\lambda$ in $2n$ non-intersecting neighborhoods,
symmetric with respect to the unit circle and the real axis (Figure 177). The
$2n$ roots of the characteristic polynomial depend continuously on the elements
of the matrix of $S$. Therefore, if the matrix $S_1$ is sufficiently close to $S$,
exactly one eigenvalue $\lambda_1$ of the matrix of $S_1$ will lie in each of the $2n$
neighborhoods of the $2n$ points of $\lambda$. But if one of the points $\lambda_1$ did not lie on the
unit circle, for example, if it lay outside the unit circle, then by the theorem in
subsection B, there would be another point $\lambda_2$, $|\lambda_2| < 1$, in the same neighborhood,
and the total number of roots would be greater than $2n$, which is not
possible.
Thus all the roots of $S_1$ lie on the unit circle and are distinct, so $S_1$ is
stable. □


Figure 177 Behavior of simple eigenvalues under a small change of the symplectic
transformation


70 $S_1$ is “sufficiently close” to $S$ if the elements of the matrix of $S_1$ in a fixed basis differ from the
elements of the matrix of $S$ in the same basis by less than a sufficiently small number $\varepsilon$.


We might say that an eigenvalue $\lambda$ of a symplectic transformation can
leave the unit circle only by colliding with another eigenvalue (Figure 178);
at the same time, the complex-conjugate eigenvalues will collide, and from
the two pairs of roots on the unit circle we obtain one 4-tuple (or pair of
real $\lambda$).


Figure 178 Behavior of multiple eigenvalues under a small change of the symplectic
transformation


It follows from the results of Section 25 that the condition for parametric 
resonance to arise in a linear canonical system with a periodically changing 
hamilton function is precisely that the corresponding symplectic transforma- 
tion of phase space should cease to be stable. It is clear from the theorem 
above that this can happen only after a collision of eigenvalues on the unit 
circle. In fact, as M. G. Krein noticed, not every such collision is dangerous. 

It turns out that the eigenvalues $\lambda$ with $|\lambda| = 1$ are divided into two classes:
positive and negative. When two roots with the same sign collide, the roots
“go through one another,” and cannot leave the unit circle. On the other
hand, when two roots with different signs collide, they generally leave the
unit circle.

M. G. Krein’s theory goes beyond the limits of this book; we will formulate
the basic results here in the form of problems.


PROBLEM. Let $\lambda$ and $\bar\lambda$ be simple (multiplicity 1) eigenvalues of a symplectic transformation $S$
with $|\lambda| = 1$. Show that the two-dimensional invariant plane $\pi_\lambda$ corresponding to $\lambda, \bar\lambda$ is
non-null.

Hint. Let $\xi_1$ and $\xi_2$ be complex eigenvectors of $S$ with eigenvalues $\lambda_1$ and $\lambda_2$. Then if $\lambda_1\lambda_2 \ne 1$,
the vectors $\xi_1$ and $\xi_2$ are skew-orthogonal: $[\xi_1, \xi_2] = 0$.


Let $\xi$ be a real vector of the plane $\pi_\lambda$, where $\operatorname{Im} \lambda > 0$ and $|\lambda| = 1$. The eigenvalue $\lambda$ is called
positive if $[S\xi, \xi] > 0$.


PROBLEM. Show that this definition is correct, i.e., it does not depend on the choice of $\xi \ne 0$ in
the plane $\pi_\lambda$.
Hint. If the plane $\pi_\lambda$ contained two non-collinear skew-orthogonal vectors, it would be null.


In the same way, an eigenvalue $\lambda$ of multiplicity $k$ with $|\lambda| = 1$ is of definite sign if the
quadratic form $[S\xi, \xi]$ is (positive or negative) definite on the invariant $2k$-dimensional subspace
corresponding to $\lambda, \bar\lambda$.


PROBLEM. Show that $S$ is strongly stable if and only if all the eigenvalues $\lambda$ lie on the unit circle
and are of definite sign.
Hint. The quadratic form $[S\xi, \xi]$ is invariant with respect to $S$.


43 A symplectic atlas 


In this paragraph we prove Darboux’s theorem, according to which every symplectic manifold
has local coordinates $p, q$ in which the symplectic structure can be written in the simplest way:
$\omega^2 = dp \wedge dq$.


A Symplectic coordinates 


Recall that the definition of manifold includes a compatibility condition for
the charts of an atlas. This is a condition on the maps $\varphi_i^{-1}\varphi_j$, going from one
chart to another. The maps $\varphi_i^{-1}\varphi_j$ are maps of a region of coordinate space.


Definition. An atlas of a manifold $M^{2n}$ is called symplectic if the standard
symplectic structure $\omega^2 = dp \wedge dq$ is introduced into the coordinate
space $R^{2n} = \{(p, q)\}$, and the transfer from one chart to another is realized
by a canonical (i.e., $\omega^2$-preserving) transformation⁷¹ $\varphi_i^{-1}\varphi_j$.


PROBLEM. Show that a symplectic atlas defines a symplectic structure on $M^{2n}$.


The converse is also true: every symplectic manifold has a symplectic
atlas. This follows from the following theorem.


71 Complex-analytic manifolds, for example, are defined analogously; there must be a complex-
analytic structure on coordinate space, and the transfer from one chart to another must be
complex analytic.

B Darboux’s theorem


Theorem. Let $\omega^2$ be a closed nondegenerate differential 2-form in a neighborhood
of a point $x$ in the space $R^{2n}$. Then in some neighborhood of $x$ one can
choose a coordinate system $(p_1, \ldots, p_n; q_1, \ldots, q_n)$ such that the form has the
standard form:

$$\omega^2 = \sum_{i=1}^{n} dp_i \wedge dq_i.$$


This theorem allows us to extend to all symplectic manifolds any assertion
of a local character which is invariant with respect to canonical transformations
and is proven for the standard phase space $(R^{2n}, \omega^2 = dp \wedge dq)$.


C Construction of the coordinates $p_1$ and $q_1$


For the first coordinate $p_1$ we take a non-constant linear function (we could
have taken any differentiable function whose differential is not zero at the
point $x$). For simplicity we will assume that $p_1(x) = 0$.

Let $P_1 = I\,dp_1$ denote the hamiltonian field corresponding to the function
$p_1$ (Figure 179). Note that $P_1(x) \ne 0$; therefore, we can draw a hyperplane
$N^{2n-1}$ through the point $x$ which does not contain the vector $P_1(x)$ (we
could have taken any surface transverse to $P_1(x)$ as $N^{2n-1}$).


Figure 179 Construction of symplectic coordinates


Consider the hamiltonian flow $P_1^t$ with hamiltonian function $p_1$. We
consider the time $t$ necessary to go from $N$ to the point $z = P_1^t(y)$ ($y \in N$)
under the action of $P_1^t$ as a function of the point $z$. By the usual theorems in
the theory of ordinary differential equations, this function is defined and
differentiable in a neighborhood of the point $x \in R^{2n}$. Denote it by $q_1$. Note
that $q_1 = 0$ on $N$ and that the derivative of $q_1$ in the direction of the field $P_1$
is equal to 1. Thus the Poisson bracket of the functions $q_1$ and $p_1$ we constructed
is equal to 1:

$$(q_1, p_1) = 1.$$


D Construction of symplectic coordinates by
induction on n


If $n = 1$, the construction is finished. Let $n > 1$. We will assume that Darboux’s
theorem is already proved for $R^{2n-2}$. Consider the set $M$ given by the
equations $p_1 = q_1 = 0$. The differentials $dp_1$ and $dq_1$ are linearly independent
at $x$ since $\omega^2(I\,dp_1, I\,dq_1) = (q_1, p_1) = 1$. Thus, by the implicit function
theorem, the set $M$ is a manifold of dimension $2n - 2$ in a neighborhood of
$x$; we will denote it by $M^{2n-2}$.


Lemma. The symplectic structure $\omega^2$ on $R^{2n}$ induces a symplectic structure on
some neighborhood of the point $x$ on $M^{2n-2}$.


PROOF. For the proof we need only the nondegeneracy of $\omega^2$ on $TM_x$.
Consider the symplectic vector space $TR^{2n}_x$. The vectors $P_1(x)$ and $Q_1(x)$
of the hamiltonian vector fields with hamiltonian functions $p_1$ and $q_1$ belong
to $TR^{2n}_x$. Let $\xi \in TM_x$. The derivatives of $p_1$ and $q_1$ in the direction $\xi$ are
equal to zero. This means that $dp_1(\xi) = \omega^2(\xi, P_1) = 0$ and $dq_1(\xi) = \omega^2(\xi, Q_1) = 0$.
Thus $TM_x$ is the skew-orthogonal complement to $P_1(x)$, $Q_1(x)$. By
Section 41B, the form $\omega^2$ on $TM_x$ is nondegenerate. □


By the induction hypothesis there are symplectic coordinates in a neighborhood
of the point $x$ on the symplectic manifold $(M^{2n-2}, \omega^2|_M)$. Denote
them by $p_i, q_i$ ($i = 2, \ldots, n$). We extend the functions $p_2, \ldots, q_n$ to a neighborhood
of $x$ in $R^{2n}$ in the following way. Every point $z$ in a neighborhood of
$x$ in $R^{2n}$ can be uniquely represented in the form $z = P_1^t Q_1^s w$, where
$w \in M^{2n-2}$ and $s$ and $t$ are small numbers. We set the values of the coordinates
$p_2, \ldots, q_n$ at $z$ equal to their values at the point $w$ (Figure 179). The
$2n$ functions $p_1, \ldots, p_n; q_1, \ldots, q_n$ thus constructed form a local coordinate
system in a neighborhood of $x$ in $R^{2n}$.


E Proof that the coordinates constructed are 
symplectic 


Denote by $P_i^t$ and $Q_i^t$ ($i = 1, \ldots, n$) the hamiltonian flows with hamiltonian
functions $p_i$ and $q_i$, and by $P_i$ and $Q_i$ the corresponding vector fields. We will
compute the Poisson brackets of the functions $p_1, \ldots, q_n$. We already saw in
C that $(q_1, p_1) = 1$. Therefore, the flows $P_1^t$ and $Q_1^s$ commute: $P_1^t Q_1^s = Q_1^s P_1^t$.

Recalling the definitions of $p_2, \ldots, q_n$ we see that each of these functions is
invariant with respect to the flows $P_1^t$ and $Q_1^s$. Thus the Poisson brackets of
$p_1$ and $q_1$ with all $2n - 2$ functions $p_i, q_i$ ($i > 1$) are equal to zero.

The map $P_1^t Q_1^s$ therefore commutes with all $2n - 2$ flows $P_i^t, Q_i^s$ ($i > 1$).
Consequently, it leaves each of the $2n - 2$ vector fields $P_i, Q_i$ ($i > 1$) fixed.
$P_1^t Q_1^s$ preserves the symplectic structure $\omega^2$ since the flows $P_1^t$ and $Q_1^s$ are
hamiltonian; therefore, the values of the form $\omega^2$ on the vectors of any two
of the $2n - 2$ fields $P_i, Q_i$ ($i > 1$) are the same at the points $z = P_1^t Q_1^s w \in R^{2n}$
and $w \in M^{2n-2}$. But these values are equal to the values of the Poisson brackets
of the corresponding hamiltonian functions. Thus, the values of the
Poisson bracket of any two of the $2n - 2$ coordinates $p_i, q_i$ ($i > 1$) at the
points $z$ and $w$ are the same if $z = P_1^t Q_1^s w$.

The functions $p_1$ and $q_1$ are first integrals of each of the $2n - 2$ flows
$P_i^t, Q_i^s$ ($i > 1$). Therefore, each of the $2n - 2$ fields $P_i, Q_i$ is tangent to the
level manifold $p_1 = q_1 = 0$. But this manifold is $M^{2n-2}$. Therefore, each of
the $2n - 2$ fields $P_i, Q_i$ ($i > 1$) is tangent to $M^{2n-2}$. Consequently, these
fields are hamiltonian fields on the symplectic manifold $(M^{2n-2}, \omega^2|_M)$, and
the corresponding hamiltonian functions are $p_i|_M, q_i|_M$ ($i > 1$). Thus, in the
whole space $(R^{2n}, \omega^2)$, the Poisson bracket of any two of the $2n - 2$ coordinates
$p_i, q_i$ ($i > 1$) considered on $M^{2n-2}$ is the same as the Poisson
bracket of these coordinates in the symplectic space $(M^{2n-2}, \omega^2|_M)$.

But, by our induction hypothesis, the coordinates on $M^{2n-2}$ ($p_i|_M, q_i|_M$;
$i > 1$) are symplectic. Therefore, in the whole space $R^{2n}$, the Poisson brackets
of the constructed coordinates have the standard values

$$(p_i, p_j) = (p_i, q_j) = (q_i, q_j) = 0 \ \ (i \ne j) \qquad \text{and} \qquad (q_i, p_i) = 1.$$


The Poisson brackets of the coordinates $p, q$ on $R^{2n}$ have the same form if
$\omega^2 = \sum dp_i \wedge dq_i$. But a bilinear form $\omega^2$ is determined by its values on
pairs of basis vectors. Therefore, the Poisson brackets of the coordinate
functions determine the shape of $\omega^2$ uniquely. Thus

$$\omega^2 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n,$$

and Darboux’s theorem is proved. □


The coordinate point of view will predominate in this chapter. The technique
of generating functions for canonical transformations, developed by 
Hamilton and Jacobi, is the most powerful method available for integrating 
the differential equations of dynamics. In addition to this technique, the 
chapter contains an “odd-dimensional” approach to hamiltonian phase 
flows. 

This chapter is independent of the previous one. It contains new proofs 
of several of the results in Chapter 8, as well as an explanation of the origin 
of the theory of symplectic manifolds. 


44 The integral invariant of Poincaré—Cartan 


In this section we look at the geometry of 1-forms in an odd-dimensional space. 


A A hydrodynamical lemma 


Let $v$ be a vector field in three-dimensional oriented euclidean space $R^3$,
and $r = \operatorname{curl} v$ its curl. The integral curves of $r$ are called vortex lines. If $\gamma_1$
is any closed curve in $R^3$ (Figure 180), the vortex lines passing through the
points of $\gamma_1$ form a tube called a vortex tube.

Let $\gamma_2$ be another curve encircling the same vortex tube, so that $\gamma_1 - \gamma_2 = \partial\sigma$,
where $\sigma$ is a 2-cycle representing a part of the vortex tube. Then:


Stokes’ lemma. The field $v$ has equal circulation along the curves $\gamma_1$ and $\gamma_2$:

$$\oint_{\gamma_1} v\,dl = \oint_{\gamma_2} v\,dl.$$


Figure 180 Vortex tube


PROOF. By Stokes’ formula, $\oint_{\gamma_1} v\,dl - \oint_{\gamma_2} v\,dl = \iint_\sigma \operatorname{curl} v\,dn = 0$, since $\operatorname{curl} v$
is tangent to the vortex tube. □
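
The lemma can be checked numerically on a simple field. The following is a minimal sketch under assumptions that are not in the text: the sample field v = (-y, x, xy), whose curl is (x, -y, 2), the unit circle in the plane z = 0 as the first curve, and a second curve obtained by sliding each point of the circle a different distance along its vortex line. The function names are illustrative.

```python
import math

def v(x, y, z):
    """Sample field v = (-y, x, x*y); its curl is (x, -y, 2)."""
    return (-y, x, x * y)

def flow_along_curl(x, y, z, s):
    """Move a point for parameter s along the vortex line of curl v."""
    return (x * math.exp(s), y * math.exp(-s), z + 2.0 * s)

def circulation(curve, n=20000):
    """Midpoint-rule approximation of the circulation of v along a closed curve.

    `curve` maps an angle in [0, 2*pi] to a point (x, y, z).
    """
    total = 0.0
    prev = curve(0.0)
    for k in range(1, n + 1):
        cur = curve(2.0 * math.pi * k / n)
        mid = [(a + b) / 2.0 for a, b in zip(prev, cur)]
        field = v(*mid)
        total += sum(f * (b - a) for f, a, b in zip(field, prev, cur))
        prev = cur
    return total

def gamma1(theta):
    return (math.cos(theta), math.sin(theta), 0.0)

def gamma2(theta):
    # Slide each point of gamma1 a different distance along its vortex line.
    return flow_along_curl(*gamma1(theta), 0.3 + 0.2 * math.sin(theta))

c1 = circulation(gamma1)
c2 = circulation(gamma2)
print(c1, c2)
```

Both curves lie on the same vortex tube, so the two printed circulations should agree up to discretization error; for this field the common value is 2π.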


B The multi-dimensional Stokes’ lemma 


It turns out that Stokes’ lemma generalizes to the case of any odd-dimensional
manifold $M^{2n+1}$ (in place of $R^3$). To formulate this generalization we replace
our vector field by a differential form.

The circulation of a vector field $v$ is the integral of the 1-form $\omega^1$
($\omega^1(\xi) = (v, \xi)$). To the curl of $v$ there corresponds the 2-form $\omega^2 = d\omega^1$
($d\omega^1(\xi, \eta) = (r, \xi, \eta)$). It is clear from these formulas that there is a direction
at every point (namely, the direction of $r$, Figure 181), having the property
that the circulation of $v$ along the boundary of every “infinitesimal square”
containing $r$ is equal to zero:

$$d\omega^1(r, \eta) = 0, \quad \forall \eta.$$


In fact, $d\omega^1(r, \eta) = (r, r, \eta) = 0$.


Figure 181 Axis invariantly connected with a 2-form in an odd-dimensional space


Remark. Passing from the 2-form $\omega^2 = d\omega^1$ to the vector field $r = \operatorname{curl} v$
is not an invariant operation: it depends on the euclidean structure of $R^3$.
Only the direction⁷² of $r$ is invariantly associated with $\omega^2$ (and, therefore,
with the 1-form $\omega^1$). It is easy to verify that, if $r \ne 0$, then the direction of $r$
is uniquely determined by the condition that $\omega^2(r, \eta) = 0$ for all $\eta$.


72 I.e., the unoriented line in $TR^3$ with direction vector $r$.

The algebraic basis for the multi-dimensional Stokes’ lemma is the
existence of an axis for every rotation of an odd-dimensional space.


Lemma. Let $\omega^2$ be an exterior algebraic 2-form on the odd-dimensional vector
space $R^{2n+1}$. Then there is a vector $\xi \ne 0$ such that

$$\omega^2(\xi, \eta) = 0, \quad \forall \eta \in R^{2n+1}.$$

PROOF. A skew-symmetric form $\omega^2$ is given by a skew-symmetric matrix $A$,

$$\omega^2(\xi, \eta) = (A\xi, \eta),$$

of odd order $2n + 1$. The determinant of such a matrix is equal to zero, since

$$A' = -A, \qquad \det A = \det A' = \det(-A) = (-1)^{2n+1} \det A = -\det A.$$


Thus the determinant of $A$ is zero. This means $A$ has an eigenvector $\xi \ne 0$
with eigenvalue 0, as was to be shown. □
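
The determinant argument can be checked on small matrices. The sketch below is an illustration under assumptions: the helper names `det`, `skew`, `cross_matrix` and the particular integer entries are hypothetical choices. It computes exact determinants of skew-symmetric matrices of orders 5 and 4 and verifies a null vector in the three-dimensional case.

```python
from fractions import Fraction

def det(a):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in a]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return result

def skew(n):
    """A sample skew-symmetric matrix of order n with entries i + 2j + 1 above the diagonal."""
    return [[0 if i == j else (i + 2 * j + 1 if i < j else -(j + 2 * i + 1))
             for j in range(n)] for i in range(n)]

def cross_matrix(a, b, c):
    """Skew-symmetric 3 x 3 matrix whose null vector is (a, b, c)."""
    return [[0, -c, b], [c, 0, -a], [-b, a, 0]]

odd_det = det(skew(5))     # odd order: must vanish
even_det = det(skew(4))    # even order: need not vanish
A = cross_matrix(2, -1, 3)
image = [sum(A[i][j] * x for j, x in enumerate((2, -1, 3))) for i in range(3)]
print(odd_det, even_det, image)
```

The even-order determinant is included only for contrast: the sign argument of the proof gives no information when the order is even.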


A vector $\xi$ for which $\omega^2(\xi, \eta) = 0$, $\forall \eta$, is called a null vector for the form $\omega^2$.
The null vectors of $\omega^2$ clearly form a linear subspace. The form $\omega^2$ is called
nonsingular if the dimension of this space is the minimal possible (i.e., 1
for an odd-dimensional space $R^{2n+1}$ or 0 for an even-dimensional space).


PROBLEM. Consider the 2-form $\omega^2 = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n$ on an even-dimensional
space $R^{2n}$ with coordinates $p_1, \ldots, p_n; q_1, \ldots, q_n$. Show that $\omega^2$ is nonsingular.


PROBLEM. On an odd-dimensional space $R^{2n+1}$ with coordinates $p_1, \ldots, p_n; q_1, \ldots, q_n; t$,
consider the 2-form $\omega^2 = \sum dp_i \wedge dq_i - \omega^1 \wedge dt$, where $\omega^1$ is any 1-form on $R^{2n+1}$. Show that $\omega^2$ is
nonsingular.


If $\omega^2$ is a nonsingular form on an odd-dimensional space $R^{2n+1}$, then
the null vectors $\xi$ of $\omega^2$ all lie on a line. This line is invariantly associated to
the form $\omega^2$.

Now let $M^{2n+1}$ be an odd-dimensional differentiable manifold and $\omega^1$
a 1-form on $M$. By the lemma above, at every point $x \in M$ there is a direction
(i.e., a straight line $\{c\xi\}$ in the tangent space $TM_x$) having the property
that the integral of $\omega^1$ along the boundary of an “infinitesimal square
containing this direction” is equal to zero:

$$d\omega^1(\xi, \eta) = 0, \quad \forall \eta \in TM_x.$$


Suppose further that the 2-form $d\omega^1$ is nonsingular. Then the direction $\xi$
is uniquely determined. We call it the “vortex direction” of the form $\omega^1$.
The integral curves of the field of vortex directions are called the vortex
lines (or characteristic lines) of the form $\omega^1$.
Let $\gamma_1$ be a closed curve on $M$. The vortex lines going out from points
of $\gamma_1$ form a “vortex tube.” We have

The multi-dimensional Stokes’ lemma. The integrals of a 1-form $\omega^1$ along any
two curves encircling the same vortex tube are the same: $\oint_{\gamma_1} \omega^1 = \oint_{\gamma_2} \omega^1$,
if $\gamma_1 - \gamma_2 = \partial\sigma$, where $\sigma$ is a piece of the vortex tube.


PROOF. By Stokes’ formula

$$\oint_{\gamma_1} \omega^1 - \oint_{\gamma_2} \omega^1 = \int_{\partial\sigma} \omega^1 = \int_\sigma d\omega^1.$$


But the value of $d\omega^1$ on any pair of vectors tangent to the vortex tube is equal
to zero. (These two vectors lie in a 2-plane containing the vortex direction,
and $d\omega^1$ vanishes on this plane.) Thus, $\int_\sigma d\omega^1 = 0$. □


C Hamilton’s equations 


All the basic propositions of hamiltonian mechanics follow directly from 
Stokes’ lemma. 

For $M^{2n+1}$ we will take the “extended phase space $R^{2n+1}$” with coordinates
$p_1, \ldots, p_n; q_1, \ldots, q_n; t$. Suppose we are given a function $H = H(p, q, t)$.
Then we can construct⁷³ the 1-form

$$\omega^1 = p\,dq - H\,dt \qquad (p\,dq = p_1\,dq_1 + \cdots + p_n\,dq_n).$$

We apply Stokes’ lemma to $\omega^1$ (Figure 182).


Figure 182 Hamiltonian field and vortex lines of the form $p\,dq - H\,dt$.


Theorem. The vortex lines of the form $\omega^1 = p\,dq - H\,dt$ on the $(2n + 1)$-dimensional
extended phase space $p, q, t$ have a one-to-one projection onto
the $t$ axis, i.e., they are given by functions $p = p(t)$, $q = q(t)$. These functions
satisfy the system of canonical differential equations with hamiltonian
function $H$:

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p}. \tag{1}$$


In other words, the vortex lines of the form $p\,dq - H\,dt$ are the trajectories
of the phase flow in the extended phase space, i.e., the integral curves of the
canonical equations (1).


73 The form $\omega^1$ seems here to appear out of thin air. In the following paragraph we will see how
the idea of using this form arose from optics.

PROOF. The differential of the form $p\,dq - H\,dt$ is equal to

$$d\omega^1 = \sum_{i=1}^{n} \left( dp_i \wedge dq_i - \frac{\partial H}{\partial p_i}\,dp_i \wedge dt - \frac{\partial H}{\partial q_i}\,dq_i \wedge dt \right).$$

It is clear from this expression that the matrix of the 2-form $d\omega^1$ in the
coordinates $p, q, t$ has the form

$$A = \begin{pmatrix} 0 & -E & H_p \\ E & 0 & H_q \\ -H_p & -H_q & 0 \end{pmatrix},$$

where

$$E = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}, \qquad H_p = \frac{\partial H}{\partial p}, \qquad H_q = \frac{\partial H}{\partial q}$$

(verify this!).

The rank of this matrix is $2n$ (the upper left $2n$-corner is non-degenerate);
therefore, $d\omega^1$ is nonsingular. It can be verified directly that the vector
$(-H_q, H_p, 1)$ is an eigenvector of $A$ with eigenvalue 0 (do it!). This means
that it gives the direction of the vortex lines of the form $p\,dq - H\,dt$. But the
vector $(-H_q, H_p, 1)$ is also the velocity vector of the phase flow of (1). Thus
the integral curves of (1) are the vortex lines of the form $p\,dq - H\,dt$, as was
to be shown. □
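
The two verifications requested in the proof can also be carried out by machine. The following sketch assembles the (2n + 1) × (2n + 1) matrix with blocks (0, -E, H_p; E, 0, H_q; -H_p, -H_q, 0) and applies it to the vector (-H_q, H_p, 1). The function names and the sample values of H_p and H_q (taken, as an assumption, from H = (p1² + p2²)/2 + q1² q2 at the point p = (0.7, -1.2), q = (0.5, 0.3)) are illustrative only.

```python
def vortex_matrix(h_p, h_q):
    """Matrix of d(p dq - H dt) in the coordinates (p, q, t), built from H_p and H_q."""
    n = len(h_p)
    size = 2 * n + 1
    a = [[0.0] * size for _ in range(size)]
    for i in range(n):
        a[i][n + i] = -1.0          # block -E
        a[n + i][i] = 1.0           # block  E
        a[i][2 * n] = h_p[i]        # last column: H_p
        a[n + i][2 * n] = h_q[i]    # last column: H_q
        a[2 * n][i] = -h_p[i]       # last row: -H_p
        a[2 * n][n + i] = -h_q[i]   # last row: -H_q
    return a

def apply(a, x):
    return [sum(r * c for r, c in zip(row, x)) for row in a]

# Sample partial derivatives (assumed Hamiltonian, see the lead-in).
h_p = [0.7, -1.2]
h_q = [2.0 * 0.5 * 0.3, 0.5 ** 2]

A = vortex_matrix(h_p, h_q)
xi = [-x for x in h_q] + list(h_p) + [1.0]   # the vector (-H_q, H_p, 1)
residual = max(abs(x) for x in apply(A, xi))
is_skew = all(A[i][j] == -A[j][i] for i in range(5) for j in range(5))
print(residual, is_skew)
```

The residual vanishes up to rounding for any values of H_p and H_q, which is the content of the remark "do it!".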


D A theorem on the integral invariant of 
Poincaré—Cartan 


We now apply Stokes’ lemma. We obtain the fundamental 


Theorem. Suppose that the two curves $\gamma_1$ and $\gamma_2$ encircle the same tube of
phase trajectories of (1). Then the integrals of the form $p\,dq - H\,dt$ along
them are the same:

$$\oint_{\gamma_1} p\,dq - H\,dt = \oint_{\gamma_2} p\,dq - H\,dt.$$


The form $p\,dq - H\,dt$ is called the integral invariant of Poincaré-Cartan.⁷⁴


PROOF. The phase trajectories are the vortex lines of the form $p\,dq - H\,dt$,
and the integrals along closed curves contained in the same vortex tube are
the same by Stokes’ lemma. □


74 In the calculus of variations $\int p\,dq - H\,dt$ is called Hilbert’s invariant integral.


Figure 183 Poincaré’s integral invariant


We will consider, in particular, curves consisting of simultaneous states,
i.e., lying in the planes $t = \text{const}$ (Figure 183). Along such curves, $dt = 0$
and $\oint p\,dq - H\,dt = \oint p\,dq$. From the preceding theorem we obtain the
important:


Corollary 1. The phase flow preserves the integral of the form $p\,dq =
p_1\,dq_1 + \cdots + p_n\,dq_n$ on closed curves.


PROOF. Let $g_{t_0}^{t_1}: R^{2n} \to R^{2n}$ be the transformation of the phase space $(p, q)$
realized by the phase flow from time $t_0$ to $t_1$ (i.e., $g_{t_0}^{t_1}(p_0, q_0)$ is the solution
to the canonical equations (1) with initial conditions $p(t_0) = p_0$, $q(t_0) = q_0$).
Let $\gamma$ be any closed curve in the space $R^{2n} \subset R^{2n+1}$ ($t = t_0$). Then $g_{t_0}^{t_1}\gamma$
is a closed curve in the space $R^{2n}$ ($t = t_1$), contained in the same tube of
phase trajectories in $R^{2n+1}$. Since $dt = 0$ on $\gamma$ and on $g_{t_0}^{t_1}\gamma$ we find by the
preceding theorem that $\oint_\gamma p\,dq = \oint_{g_{t_0}^{t_1}\gamma} p\,dq$, as was to be shown. □
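
Corollary 1 can be tested on a flow that is known in closed form. The sketch below assumes n = 1 and the hamiltonian function H = p³/3, chosen here only because its phase flow (p, q) → (p, q + t p²) is explicit and nonlinear; the closed curve is the unit circle sampled at 20000 points, and `loop_integral` is an illustrative helper that approximates the integral of p dq by the trapezoid rule.

```python
import math

def flow(p, q, t):
    """Exact phase flow of the assumed H = p**3 / 3: dp/dt = 0, dq/dt = p**2."""
    return (p, q + t * p * p)

def loop_integral(points):
    """Trapezoid approximation of the integral of p dq around a closed polygon."""
    total = 0.0
    for (p0, q0), (p1, q1) in zip(points, points[1:] + points[:1]):
        total += 0.5 * (p0 + p1) * (q1 - q0)
    return total

n = 20000
circle = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
          for k in range(n)]
moved = [flow(p, q, 1.5) for p, q in circle]

before = loop_integral(circle)
after = loop_integral(moved)
print(before, after)
```

The image curve is visibly sheared, yet the two integrals agree up to discretization error; for the unit circle the common value is π.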

The form $p\,dq$ is called Poincaré’s relative integral invariant. It has a
simple geometric meaning. Let $\sigma$ be a two-dimensional oriented chain and
$\gamma = \partial\sigma$. Then, by Stokes’ formula, we find

$$\oint_\gamma p\,dq = \iint_\sigma dp \wedge dq.$$


Thus we have proved the important:


Corollary 2. The phase flow preserves the sum of the oriented areas of the
projections of a surface onto the $n$ coordinate planes $(p_i, q_i)$:

$$\iint_\sigma dp \wedge dq = \iint_{g_{t_0}^{t_1}\sigma} dp \wedge dq.$$


In other words, the 2-form $\omega^2 = dp \wedge dq$ is an absolute integral invariant
of the phase flow.


EXAMPLE. For $n = 1$, $\omega^2$ is area, and we obtain Liouville’s theorem: the
phase flow preserves area.

E Canonical transformations

Let $g$ be a differentiable mapping of the phase space $R^{2n} = \{(p, q)\}$ to $R^{2n}$.


Definition. The mapping $g$ is called canonical, or a canonical transformation,
if $g$ preserves the 2-form $\omega^2 = \sum dp_i \wedge dq_i$.


It is clear from the argument above that this definition can be written
in any of three equivalent forms:


1. $g^*\omega^2 = \omega^2$ ($g$ preserves the 2-form $\sum dp_i \wedge dq_i$);

2. $\iint_\sigma \omega^2 = \iint_{g\sigma} \omega^2$, $\forall \sigma$ ($g$ preserves the sum of the areas of the projections
of any surface);

3. $\oint_\gamma p\,dq = \oint_{g\gamma} p\,dq$ (the form $p\,dq$ is a relative integral invariant of $g$).


PROBLEM. Show that definitions (1) and (2) are equivalent to (3) if the domain of the map in
question is a simply connected region in the phase space $R^{2n}$; in the general case $3 \Rightarrow 2 \Leftrightarrow 1$.


The corollaries above can now be formulated as: 


Theorem. The transformation of phase space induced by the phase flow is
canonical.⁷⁵


Let $g: R^{2n} \to R^{2n}$ be a canonical transformation: $g$ preserves the form $\omega^2$.
Then $g$ also preserves the exterior square of $\omega^2$:

$$g^*(\omega^2 \wedge \omega^2) = \omega^2 \wedge \omega^2 \qquad \text{and} \qquad g^*(\omega^2)^k = (\omega^2)^k.$$

The exterior powers of the form $\sum dp_i \wedge dq_i$ are proportional to the forms

$$\omega^4 = \sum_{i<j} dp_i \wedge dp_j \wedge dq_i \wedge dq_j,$$

$$\omega^{2k} = \sum_{i_1 < \cdots < i_k} dp_{i_1} \wedge \cdots \wedge dp_{i_k} \wedge dq_{i_1} \wedge \cdots \wedge dq_{i_k}.$$


Thus we have proved


Theorem. Canonical transformations preserve the integral invariants
$\omega^4, \ldots, \omega^{2n}$.


Geometrically, the integral of the form $\omega^{2k}$ is the sum of the oriented
volumes of the projections onto the coordinate planes $(p_{i_1}, \ldots, p_{i_k}, q_{i_1}, \ldots, q_{i_k})$.
In particular, $\omega^{2n}$ is proportional to the volume element, and we obtain:


Corollary. Canonical transformations preserve the volume element in phase
space:
the volume of $gD$ is equal to the volume of $D$, for any region $D$.


75 The proof of this theorem which is presented in the excellent book by Landau and Lifshitz
(Mechanics, Pergamon, Oxford, 1960) is incorrect.

In particular, applying this to the phase flow we obtain


Corollary. The phase flow (1) has as integral invariants the forms
$\omega^2, \omega^4, \ldots, \omega^{2n}$.


The last of these invariants is the phase volume, so we have again proved 
Liouville’s theorem. 
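
Preservation of phase volume can be observed numerically for a nonlinear canonical map. The sketch below assumes n = 2 and a "kick then drift" map, p → p - ∇V(q) followed by q → q + p, with V = q1² q2 + cos q2; this map is an illustrative example (each of its two steps preserves Σ dp_i ∧ dq_i), not one taken from the text. Its Jacobian matrix is estimated by central differences, and its determinant should equal 1.

```python
import math

def kick_drift(x):
    """A sample canonical map of R^4 = {(p1, p2, q1, q2)} (hypothetical example)."""
    p1, p2, q1, q2 = x
    # Kick: p -> p - grad V(q) with V = q1**2 * q2 + cos(q2).
    p1 -= 2.0 * q1 * q2
    p2 -= q1 * q1 - math.sin(q2)
    # Drift: q -> q + p.
    return [p1, p2, q1 + p1, q2 + p2]

def jacobian(f, x, h=1e-5):
    """Central-difference Jacobian matrix: entry [i][j] is d f_i / d x_j."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        cols.append([(a - b) / (2.0 * h) for a, b in zip(f(xp), f(xm))])
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in a]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return result

volume_factor = det(jacobian(kick_drift, [0.3, -0.7, 1.1, 0.4]))
print(volume_factor)
```

The determinant of the Jacobian is the local factor by which the map changes four-dimensional volume, so a value of 1 at every point is the infinitesimal form of the corollary.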


45 Applications of the integral invariant of Poincaré-Cartan


In this paragraph we prove that canonical transformations preserve the form of Hamilton’s
equations, that a first integral of Hamilton’s equations allows us to reduce immediately the order
of the system by two and that motion in a natural lagrangian system proceeds along geodesics
of the configuration space provided with a certain riemannian metric.


A. Changes of variables in the canonical equations 


The invariant nature of the connection between the form p dq — H dt and 
its curl lines gives rise to a way of writing the equations of motion in any 
system of 2n + 1 coordinates in extended phase space {(p, q, t)}. 


Figure 184 Change of variables in Hamilton’s equations


Let $(x_1, \ldots, x_{2n+1})$ be coordinate functions in some chart of extended
phase space (considered as a manifold $M^{2n+1}$, Figure 184). The coordinates
$(p, q, t)$ can be considered as giving another chart on $M$. The form $\omega^1 =
p\,dq - H\,dt$ can be considered as a differential 1-form on $M$. Invariantly
associated (not depending on the chart) to this form is a family of lines on $M$:
the vortex lines. In the chart $(p, q, t)$, these lines are represented as the
trajectories of the phase flow

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p} \tag{1}$$

with hamiltonian function $H(p, q, t)$.
Suppose that in the coordinates $(x_1, \ldots, x_{2n+1})$ the form $\omega^1$ is written as

$$p\,dq - H\,dt = X_1\,dx_1 + \cdots + X_{2n+1}\,dx_{2n+1}.$$

Theorem. In the chart $(x_i)$, the trajectories of (1) are represented by the vortex
lines of the form $\sum X_i\,dx_i$.


PROOF. The curl lines of the forms $\sum X_i\,dx_i$ and $p\,dq - H\,dt$ are the images
in two different charts of the vortex lines of the same form on $M$. But the
integral curves of (1) are the vortex lines of $p\,dq - H\,dt$. Thus, their images
in the chart $(x_i)$ are the vortex lines of the form $\sum X_i\,dx_i$. □


Corollary. Let $(P_1, \ldots, P_n; Q_1, \ldots, Q_n; T)$ be a coordinate system on the
extended phase space $(p, q, t)$ and $K(P, Q, T)$ and $S(P, Q, T)$ functions
such that

$$p\,dq - H\,dt = P\,dQ - K\,dT + dS$$

(the left- and right-hand sides are forms on extended phase space).
Then the trajectories of the phase flow (1) are represented in the chart
$(P, Q, T)$ by the integral curves of the canonical equations

$$\frac{dP}{dT} = -\frac{\partial K}{\partial Q}, \qquad \frac{dQ}{dT} = \frac{\partial K}{\partial P}. \tag{2}$$

PROOF. By the theorem above, the trajectories of (1) are represented by the
vortex lines of the form $P\,dQ - K\,dT + dS$. But $dS$ has no influence on
the vortex lines (since $ddS = 0$). Therefore, the images of the trajectories of (1)
are the vortex lines of the form $P\,dQ - K\,dT$. According to Section 44C,
the vortex lines of such a form are integral curves of the canonical equations
(2). □


In particular, let $g: R^{2n} \to R^{2n}$ be a canonical transformation of phase
space taking a point with coordinates $(p, q)$ to a point with coordinates
$(P, Q)$. The functions $P(p, q)$ and $Q(p, q)$ can be considered as new coordinates
on phase space.


Theorem. In the new coordinates $(P, Q)$ the canonical equations (1) have
the canonical form⁷⁶

$$\frac{dP}{dt} = -\frac{\partial K}{\partial Q}, \qquad \frac{dQ}{dt} = \frac{\partial K}{\partial P} \tag{3}$$

with the same hamiltonian function: $K(P, Q, t) = H(p, q, t)$.


76 In some textbooks the property of preserving the canonical form of Hamilton’s equations is
taken as the definition of a canonical transformation. This definition is not equivalent to the
generally accepted one mentioned above. For example, the transformation $P = 2p$, $Q = q$,
which is not canonical by our definition, preserves the hamiltonian form of the equations of
motion. This confusion appears even in the excellent textbook by Landau and Lifshitz (Mechanics,
Oxford, Pergamon, 1960); in Section 45 of this book they show that every transformation which
preserves the canonical equations is canonical in our sense.


Figure 185 Closedness of the form $p\,dq - P\,dQ$


PROOF. Consider the 1-form $p\,dq - P\,dQ$ on $R^{2n}$. For any closed curve $\gamma$
we have (Figure 185)

$$\oint_\gamma p\,dq - P\,dQ = \oint_\gamma p\,dq - \oint_{g\gamma} p\,dq = 0$$

since $g$ is canonical. Therefore, $\int_{(p_0, q_0)}^{(p_1, q_1)} p\,dq - P\,dQ = S$ does not depend on
the path of integration but only on the endpoint $(p_1, q_1)$ (for a fixed initial
point $(p_0, q_0)$). Thus $dS = p\,dq - P\,dQ$. Consequently, in the extended
phase space, we have

$$p\,dq - H\,dt = P\,dQ - H\,dt + dS.$$

Thus, the theorem above is applicable, and (2) is transformed to (3). □
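
The theorem, and the counterexample P = 2p, Q = q mentioned in footnote 76, can be compared numerically. The sketch below makes several assumptions that are not in the text: n = 1, the hamiltonian function H = p²/2 + q⁴/4, the canonical shear P = p, Q = q + p, and central differences for the partial derivatives. For each map it pushes the phase velocity forward and compares it with Hamilton's equations for K(P, Q) = H(p, q).

```python
def H(p, q):
    """Assumed sample hamiltonian function."""
    return 0.5 * p * p + 0.25 * q ** 4

def grad(f, a, b, h=1e-6):
    """Central-difference partial derivatives (df/da, df/db)."""
    return ((f(a + h, b) - f(a - h, b)) / (2.0 * h),
            (f(a, b + h) - f(a, b - h)) / (2.0 * h))

def velocity(p, q):
    """Right-hand side of Hamilton's equations: (dp/dt, dq/dt)."""
    h_p, h_q = grad(H, p, q)
    return (-h_q, h_p)

def K_shear(P, Q):
    # H expressed through P = p, Q = q + p (canonical: dP ^ dQ = dp ^ dq).
    return H(P, Q - P)

def K_scaled(P, Q):
    # H expressed through P = 2p, Q = q (not canonical: dP ^ dQ = 2 dp ^ dq).
    return H(P / 2.0, Q)

p, q = 0.8, -1.3
dp, dq = velocity(p, q)

# Shear: (dP/dt, dQ/dt) = (dp/dt, dq/dt + dp/dt) versus (-K_Q, K_P).
k_P, k_Q = grad(K_shear, p, q + p)
shear_error = max(abs(dp + k_Q), abs(dq + dp - k_P))

# Scaling: (dP/dt, dQ/dt) = (2 dp/dt, dq/dt) versus K = H and versus K = 2H.
s_P, s_Q = grad(K_scaled, 2.0 * p, q)
scaled_error_same_H = max(abs(2.0 * dp + s_Q), abs(dq - s_P))
scaled_error_double_H = max(abs(2.0 * dp + 2.0 * s_Q), abs(dq - 2.0 * s_P))

print(shear_error, scaled_error_same_H, scaled_error_double_H)
```

In this sketch the shear reproduces Hamilton's equations with the same hamiltonian function, while the scaling does so only after the hamiltonian function is replaced by 2H, which is the distinction drawn in the footnote.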


PROBLEM. Let $g(t): R^{2n} \to R^{2n}$ be a canonical transformation of phase space depending on the
parameter $t$, $g(t)(p, q) = (P(p, q, t), Q(p, q, t))$. Show that in the variables $P, Q, t$ the canonical
equations (1) have the canonical form with new hamiltonian function

$$K(P, Q, t) = H(p, q, t) + \frac{\partial S}{\partial t} + P\,\frac{\partial Q}{\partial t},$$

where

$$S(p_1, q_1, t) = \int_{(p_0, q_0)}^{(p_1, q_1)} p\,dq - P\,dQ \qquad (dQ \text{ for fixed } t).$$


B Reduction of order using the energy integral 


Suppose now that the hamiltonian function H(p, q) does not depend on time. 
Then the canonical equations (1) have a first integral: H(p(t), q(t)) = const. 
It turns out that by using this integral we can reduce the dimension (2n + 1) 
of the extended phase space by two, thereby reducing the problem to in- 
tegration of a system of canonical equations in a (2n — 1)-dimensional space. 

We assume that (in some region) the equation $h = H(p_1, \ldots, p_n; q_1, \ldots, q_n)$
can be solved for $p_1$:

$$p_1 = K(P, Q, T; h),$$

where $P = (p_2, \ldots, p_n)$; $Q = (q_2, \ldots, q_n)$; $T = -q_1$. Then we find

$$p\,dq - H\,dt = P\,dQ - K\,dT - d(Ht) + t\,dH.$$


Now let $\gamma$ be an integral curve of the canonical equations (1) lying on the
$2n$-dimensional surface $H(p, q) = h$ in $R^{2n+1}$. Then $\gamma$ is a vortex line of the
form $p\,dq - H\,dt$ (Figure 186). We project the extended phase space $R^{2n+1} =
\{(p, q, t)\}$ onto the phase space $R^{2n} = \{(p, q)\}$. The surface $H = h$ is projected
onto a $(2n - 1)$-dimensional manifold $M^{2n-1}$: $H(p, q) = h$ in $R^{2n}$,
and $\gamma$ is projected to a curve $\bar\gamma$ lying on this submanifold. The variables
$P, Q, T$ form local coordinates on $M^{2n-1}$.


Figure 186 Lowering the order of a hamiltonian system 


PROBLEM. Show that the curve $\bar\gamma$ is a vortex line of the form $p\,dq = P\,dQ - K\,dT$ on $M^{2n-1}$.
Hint. $d(Ht)$ does not affect the vortex lines, and $dH$ is zero on $M$.


But the vortex lines of $P\,dQ - K\,dT$ satisfy Hamilton’s equations (2).
Thus we have proved


Theorem. The phase trajectories of the equations (1) on the surface $M^{2n-1}$,
$H = h$, satisfy the canonical equations

$$\frac{dp_i}{dq_1} = \frac{\partial K}{\partial q_i}, \qquad \frac{dq_i}{dq_1} = -\frac{\partial K}{\partial p_i} \qquad (i = 2, \ldots, n),$$

where the function $K(p_2, \ldots, p_n; q_2, \ldots, q_n; T, h)$ is defined by the equation

$$H(K, p_2, \ldots, p_n; -T, q_2, \ldots, q_n) = h.$$


C The principle of least action in phase space 


In the extended phase space $\{(p, q, t)\}$, we consider an integral curve $\gamma$ of the
canonical equations (1) connecting the points $(p_0, q_0, t_0)$ and $(p_1, q_1, t_1)$.


Theorem. The integral $\int p\,dq - H\,dt$ has $\gamma$ as an extremal under variations
of $\gamma$ for which the ends of the curve remain in the $n$-dimensional subspaces
$(t = t_0, q = q_0)$ and $(t = t_1, q = q_1)$.


PROOF. The curve $\gamma$ is a vortex line of the form $p\,dq - H\,dt$ (Figure 187).
Therefore, the integral of $p\,dq - H\,dt$ over an “infinitely small parallelogram
passing through the vortex direction” is equal to zero. In other words, the
increment $\int_{\gamma'} - \int_\gamma p\,dq - H\,dt$ is small to a higher order in comparison with
the difference of the curves $\gamma$ and $\gamma'$, as was to be shown.


Figure 187 Principle of least action in phase space


If this argument does not seem rigorous enough, it can be replaced by the
computation

$$\delta \int_\gamma (p\dot q - H)\,dt = \int_\gamma \left( \dot q\,\delta p + p\,\delta\dot q - \frac{\partial H}{\partial p}\,\delta p - \frac{\partial H}{\partial q}\,\delta q \right) dt
= p\,\delta q\,\Big|_0^1 + \int_\gamma \left[ \left( \dot q - \frac{\partial H}{\partial p} \right) \delta p - \left( \dot p + \frac{\partial H}{\partial q} \right) \delta q \right] dt.$$


We see that the integral curves of Hamilton’s equations are the only
extremals of the integral $\int p\,dq - H\,dt$ in the class of curves $\gamma$ whose ends
lie in the $n$-dimensional subspaces $(t = t_0, q = q_0)$ and $(t = t_1, q = q_1)$
of extended phase space.
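
The stationarity asserted by the theorem can be probed numerically. The following sketch rests on assumptions made only for illustration: n = 1, the hamiltonian function H = (p² + q²)/2 with the exact solution q = sin t, p = cos t on the interval [0, 1], a variation δq = sin(πt) that vanishes at both ends, and a variation δp = 1 that does not. The action is evaluated by Simpson's rule and its first variation by a central difference.

```python
import math

T = 1.0  # length of the time interval [0, T]

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def action(q, q_dot, p):
    """Integral of p*dq/dt - H over [0, T] for the assumed H = (p**2 + q**2) / 2."""
    return simpson(lambda t: p(t) * q_dot(t) - 0.5 * (p(t) ** 2 + q(t) ** 2),
                   0.0, T)

def first_variation(q, q_dot, p, eps=1e-3):
    """Central difference of the action along (dq, dp) = (sin(pi*t/T), 1)."""
    def varied(sign):
        e = sign * eps
        return action(
            lambda t: q(t) + e * math.sin(math.pi * t / T),
            lambda t: q_dot(t) + e * (math.pi / T) * math.cos(math.pi * t / T),
            lambda t: p(t) + e,
        )
    return (varied(+1) - varied(-1)) / (2.0 * eps)

# A solution of Hamilton's equations: q = sin t, p = cos t.
on_shell = first_variation(math.sin, math.cos, math.cos)

# A comparison curve with the same endpoints in q that is not a solution.
slope = math.sin(T) / T
off_shell = first_variation(lambda t: slope * t, lambda t: slope,
                            lambda t: slope)

print(on_shell, off_shell)
```

The first variation vanishes along the true trajectory even though p is varied freely at the endpoints, and it is clearly nonzero along the straight-line comparison curve.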


Remark. The principle of least action in Hamilton’s form is a particular case of the principle
considered above. Along extremals, we have

$$\int_{(t_0, q_0)}^{(t_1, q_1)} p\,dq - H\,dt = \int_{t_0}^{t_1} (p\dot q - H)\,dt = \int_{t_0}^{t_1} L\,dt$$

(since the lagrangian $L$ and the hamiltonian $H$ are Legendre transforms of one another). Now
let $\bar\gamma$ (Figure 188) be the projection of the extremal $\gamma$ onto the $q, t$ plane. To any nearby curve $\bar\gamma'$
connecting the same points $(t_0, q_0)$ and $(t_1, q_1)$ in the $q, t$ plane we associate a curve $\gamma'$ in the
phase space $(p, q, t)$ by setting $p = \partial L/\partial \dot q$. Then, along $\gamma'$, too, $\int_{\gamma'} p\,dq - H\,dt = \int_{\bar\gamma'} L\,dt$. But
by the theorem above, $\delta \int_\gamma p\,dq - H\,dt = 0$ for any variation of the curve $\gamma$ (with boundary conditions
$(t = t_0, q = q_0)$ and $(t = t_1, q = q_1)$). In particular, this is true for variations of the special form
taking $\gamma$ to $\gamma'$. Thus $\bar\gamma$ is an extremal of $\int L\,dt$, as was to be shown.


Figure 188 Comparison curves for the principles of least action in the configuration
and phase spaces


In the theorem above we are allowed to compare y with a significantly 
wider class of curves y’ than in Hamilton’s principle: there are no restrictions 
placed on the relation of p with q. Surprisingly, one can show that the two 
principles are nevertheless equivalent: an extremal in the narrower class of 
variations ($p = \partial L/\partial \dot q$) is an extremal under all variations. The explanation
is that, for fixed $\dot q$, the value $p = \partial L/\partial \dot q$ is an extremal of $p\dot q - H$ (cf. the
definition of the Legendre transform, Section 14).


D The principle of least action in the 
Maupertuis—Euler-Lagrange—Jacobi form 


Suppose now that the hamiltonian function H(p, q) does not depend on time. 
Then H(p, q) is a first integral of Hamilton’s equations (1). We project the 
surface H(p, q) = h from the extended phase space {(p, q, t)} to the space 
{(p, q)}. We obtain a (2n − 1)-dimensional surface H(p, q) = h in $R^{2n}$,
which we already studied in subsection B and which we denoted by $M^{2n-1}$.

The phase trajectories of the canonical equations (1) beginning on the
surface $M^{2n-1}$ lie entirely in $M^{2n-1}$. They are the vortex lines of the form
p dq = P dQ − K dT (in the notation of B) on $M^{2n-1}$. By the theorem in
subsection C, the curves (1) on $M^{2n-1}$ are extremals for the variational
principle corresponding to this form. Therefore, we have proved


Theorem. If the hamiltonian function H = H(p, q) does not depend on time,
then the phase trajectories of the canonical equations (1) lying on the surface
$M^{2n-1}$: H(p, q) = h are extremals of the integral $\int p\,dq$ in the class of
curves lying on $M^{2n-1}$ and connecting the subspaces $q = q_0$ and $q = q_1$.


We now consider the projection onto the q-space of an extremal lying
on the surface $M^{2n-1}$: H(p, q) = h. This curve connects the points $q_0$ and
$q_1$. Let $\gamma$ be another curve connecting the points $q_0$ and $q_1$ (Figure 189).
The curve $\gamma$ is the projection of some curve $\hat\gamma$ on $M^{2n-1}$. Specifically, we
parametrize $\gamma$ by t, a ≤ t ≤ b, $\gamma(a) = q_0$, $\gamma(b) = q_1$. Then at every point q
of $\gamma$ there is a velocity vector $\dot q = d\gamma(t)/dt$, and the corresponding momentum
$p = \partial L/\partial \dot q$. If the parameter t is chosen so that H(p, q) = h, then we obtain
a curve $\hat\gamma$: $q = \gamma(t)$, $p = \partial L/\partial \dot q$ on the surface $M^{2n-1}$. Applying the theorem
above to the curve $\hat\gamma$ on $M^{2n-1}$, we obtain

Figure 189 Maupertuis’ principle


Corollary. Among all curves $q = \gamma(t)$ connecting the two points $q_0$ and $q_1$ on
the plane q and parametrized so that the hamiltonian function has a fixed
value $H(\partial L/\partial \dot q, q) = h$, the trajectory of the equations of dynamics (1) is
an extremal of the integral of “reduced action”

$$\int_\gamma p\,dq = \int_\gamma p\,\dot q\,dt = \int_\gamma \frac{\partial L}{\partial \dot q}\,\dot q\,dt.$$


This is the principle of least action of Maupertuis (Euler-Lagrange-
Jacobi).⁷⁷ It is important to note that the interval a ≤ t ≤ b parametrizing
the curve $\gamma$ is not fixed and can be different for different curves being
compared. On the other hand, the energy (the hamiltonian function) must be
the same. We note also that the principle determines the shape of a trajectory
but not the time: in order to determine the time we must use the energy
constant.

The principle above takes a particularly simple form in the case when the 
system represents inertial motion on a smooth manifold. 


Theorem. A point mass confined to a smooth riemannian manifold moves along
geodesic lines (i.e., along extremals of the length $\int ds$).


PROOF. In this case,

$$H = L = T = \frac{1}{2}\left(\frac{ds}{dt}\right)^2 \quad\text{and}\quad \frac{\partial L}{\partial \dot q}\,\dot q = 2T = \left(\frac{ds}{dt}\right)^2.$$

Therefore, in order to guarantee a fixed value of H = h, the parameter must
be chosen proportional to the length: $dt = ds/\sqrt{2h}$. The reduced action
integral is then equal to

$$\int_\gamma \frac{\partial L}{\partial \dot q}\,\dot q\,dt = \int_\gamma \sqrt{2h}\,ds = \sqrt{2h}\int_\gamma ds;$$

therefore, extremals are geodesics of our manifold. □


In the case when there is a potential energy, the trajectories of the equa- 
tions of dynamics are also geodesics in a certain riemannian metric. 


77 “In almost all textbooks, even the best, this principle is presented so that it is impossible to
understand.” (C. Jacobi, Lectures on Dynamics, 1842-1843). I do not choose to break with 
tradition. A very interesting “proof” of Maupertuis’ principle is in Section 44 of the mechanics 
textbook of Landau and Lifshitz (Mechanics, Oxford, Pergamon, 1960). 
Let $ds^2$ be a riemannian metric on configuration space which gives the
kinetic energy (so that $T = \frac{1}{2}(ds/dt)^2$). Let h be a constant.


Theorem. In the region of configuration space where U(q) < h we define
a riemannian metric by the formula

$$d\rho = \sqrt{h - U(q)}\;ds.$$

Then the trajectories of the system with kinetic energy $T = \frac{1}{2}(ds/dt)^2$,
potential energy U(q), and total energy h will be geodesic lines of the metric
$d\rho$.


PROOF. In this case L = T − U, H = T + U, and $(\partial L/\partial \dot q)\,\dot q = 2T =
(ds/dt)^2 = 2(h - U)$. Therefore, in order to guarantee a fixed value of
H = h, the parameter t must be chosen proportional to length: $dt =
ds/\sqrt{2(h - U)}$. The reduced action integral will then be equal to

$$\int_\gamma \frac{\partial L}{\partial \dot q}\,\dot q\,dt = \int_\gamma \sqrt{2(h - U)}\,ds = \sqrt{2}\int_\gamma d\rho.$$

By Maupertuis’ principle, the trajectories are geodesics in the metric $d\rho$,
as was to be shown. □
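The identity used in the proof, $(\partial L/\partial \dot q)\,\dot q\,dt = \sqrt{2}\,d\rho$ along a motion of energy h, can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: a planar system with the hypothetical potential U = (x² + 4y²)/2, euclidean ds, unit mass, and a hand-written Runge-Kutta integrator. It only confirms that the reduced action and √2 times the ρ-length agree along a computed trajectory; it does not by itself show that the trajectory is a geodesic.

```python
import math

def potential(x, y):
    return 0.5 * (x * x + 4.0 * y * y)       # hypothetical potential U(q)

def rhs(state):
    x, y, vx, vy = state
    return (vx, vy, -x, -4.0 * y)             # Newton's equations: acceleration = -grad U

def rk4_step(state, dt):
    """One classical Runge-Kutta step for the first-order system above."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def reduced_action_and_rho_length(state, t_end=1.0, n=4000):
    """Return (integral of 2T dt, sqrt(2) * integral of sqrt(h - U) ds) along the motion."""
    x, y, vx, vy = state
    h = 0.5 * (vx * vx + vy * vy) + potential(x, y)    # total energy of the motion
    dt = t_end / n
    reduced, rho_length = 0.0, 0.0
    for _ in range(n):
        nxt = rk4_step(state, dt)
        for x, y, vx, vy in (state, nxt):              # trapezoid rule on each step
            speed = math.hypot(vx, vy)                 # ds/dt
            reduced += 0.5 * speed * speed * dt
            rho_length += 0.5 * math.sqrt(max(h - potential(x, y), 0.0)) * speed * dt
        state = nxt
    return reduced, math.sqrt(2.0) * rho_length

reduced, scaled_rho = reduced_action_and_rho_length((1.0, 0.0, 0.0, 1.0))
print(reduced, scaled_rho)
```

The two printed numbers should agree up to the small energy drift of the integrator.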


Remark 1. The metric $d\rho$ is obtained from ds by a “stretching” depending
on the point q but not depending on the direction. Therefore, angles in the
metric $d\rho$ are the same as angles in the metric ds. On the boundary of the
region U ≤ h the metric $d\rho$ has a singularity: the closer we come to the
boundary, the smaller the ρ-length becomes. In particular, the length of any
curve lying in the boundary (U = h) is equal to zero.


Remark 2. If the initial and endpoints of a geodesic y are sufficiently close, 
then the extremum of length is a minimum. This justifies the name “ principle 
of least action.” In general, an extremum of the action is not necessarily a 
minimum, as we see by considering geodesics on the unit sphere (Figure 190). 
Every arc of a great circle is a geodesic, but only those with length less than
π are minimal: the arc NS'M is shorter than the great circle arc NSM.
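As a small illustration of Remark 2, the two great-circle arcs joining a pair of points on the unit sphere can be compared directly. The helper name and the sample points below are illustrative assumptions, not part of the text.

```python
import math

def great_circle_arcs(a, b):
    """Lengths of the two great-circle arcs joining unit vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    theta = math.acos(max(-1.0, min(1.0, dot)))   # angle between the points, in [0, pi]
    return theta, 2.0 * math.pi - theta

north = (0.0, 0.0, 1.0)
point = (math.sin(2.0), 0.0, math.cos(2.0))       # at angular distance 2 from the pole
short_arc, long_arc = great_circle_arcs(north, point)
print(short_arc, long_arc)
```

Both arcs are geodesics, but only the one of length less than π realizes the minimum of length between its ends.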


Figure 190 Non-minimal geodesic
Remark 3. If h is larger than the maximum value of U on the configuration
space, then the metric $d\rho$ has no singularities; therefore, we can apply
topological theorems about geodesics on riemannian manifolds to the study
of mechanical systems. For example, we consider the torus $T^2$ with some
riemannian metric. Among all closed curves on $T^2$ making m rotations


Figure 191 Periodic motion of a double pendulum 


around the parallel and n around the meridian, there exists a curve of shortest
length (Figure 191). This curve is a closed geodesic (for a proof see books
on the calculus of variations or “Morse theory”). On the other hand, the
torus $T^2$ is the configuration space of a planar double pendulum. Therefore,


Theorem. For any integers m and n there is a periodic motion of the double 
pendulum under which one segment makes m rotations while the other 
segment makes n rotations. 


Furthermore, such periodic motions exist for any sufficiently large values 
of the constant h (h must be larger than the potential energy at the highest 
position). 

As a last example we consider a rigid body fastened at a stationary point 
and located in an arbitrary potential field. The configuration space (SO(3)) 
is not simply connected: there exist non-contractible curves in it. The above 
arguments imply 


Theorem. In any potential force field, there exists at least one periodic motion 
of the body. Furthermore, there exist periodic motions for which the total 
energy h is arbitrarily large. 


46 Huygens’ principle 


The fundamental notions of hamiltonian mechanics (momenta, the hamiltonian function H, 
the form p dq — H dt and the Hamilton-Jacobi equations, all of which we will be concerned 
with below) arose by the transforming of several very simple and natural notions of geometric 
optics, guided by a particular variational principle—that of Fermat, into general variational 
principles (and in particular into Hamilton's principle of stationary action, $\delta \int L\,dt = 0$).
A Wave fronts


We consider briefly⁷⁸ the fundamental notions of geometric optics. According
to the extremal principle of Fermat, light travels from a point $q_0$ to a point
$q_1$ in the shortest possible time. The speed of the light can depend both on the
point q (an “inhomogeneous medium”) and on the direction of the ray
(in an “anisotropic medium,” such as a crystal). The characteristics of a
medium can be described by giving a surface (the “indicatrix”) in the tangent
space at each point q. To do this, we take in every direction the velocity vector
of the propagation of light at the given point in the given direction (Figure
192).

Figure 192 An anisotropic, inhomogeneous medium


Now let t > 0. We look at the set of all points q to which light from a given
point $q_0$ can travel in time less than or equal to t. The boundary of this set,
$\Phi_{q_0}(t)$, is called the wave front of the point $q_0$ after time t and consists of points
to which light can travel in time t and not faster.

There is a remarkable relation, discovered by Huygens, between the wave 
fronts corresponding to different values of t. (Figure 193) 


Huygens’ theorem. Let $\Phi_{q_0}(t)$ be the wave front of the point $q_0$ after time t.
For every point q of this front, consider the wave front after time s, $\Phi_q(s)$.
Then the wave front of the point $q_0$ after time s + t, $\Phi_{q_0}(s + t)$, will be the
envelope of the fronts $\Phi_q(s)$, $q \in \Phi_{q_0}(t)$.


PROOF. Let $q_{t+s} \in \Phi_{q_0}(t + s)$. Then there exists a path from $q_0$ to $q_{t+s}$ along
which the time of travel of light equals t + s, and there is none shorter. We
look at the point $q_t$ on this path, to which light travels in time t. No shorter
path from $q_0$ to $q_t$ can exist; otherwise, the path $q_0 q_{t+s}$ would not be the
shortest. Therefore, the point $q_t$ lies on the front $\Phi_{q_0}(t)$. In exactly the same
way light travels the path $q_t q_{t+s}$ in time s, and there is no shorter path from
$q_t$ to $q_{t+s}$. Therefore, the point $q_{t+s}$ lies on the front of the point $q_t$ at time s,
$\Phi_{q_t}(s)$. We will show that the fronts $\Phi_{q_t}(s)$ and $\Phi_{q_0}(t + s)$ are tangent. In


78 We will not pursue rigor here, and will assume that all determinants are different from zero, 
etc. The proofs of the subsequent theorems do not depend on the semi-heuristic arguments of 
this paragraph. It should be noted that the appropriate lagrangian for geometric optics is 
homogeneous of order 1 in the velocities. To apply the Legendre transform, and to make the 
analogy with mechanics in the following section, we should square this lagrangian, which does 
not affect the indicatrix surface where the value is 1. In fact, the real meaning of Huygens’ principle 
is best expressed in contact geometry (see Appendix 4 or the author’s Singularities of Caustics 
and Wave Fronts, Kluwer 1990). 
Figure 193 Envelope of wave fronts


fact, if they crossed each other (Figure 194), then it would be possible to
reach some points of $\Phi_{q_0}(t + s)$ from $q_t$ in time less than s, and therefore
from $q_0$ in time less than s + t. This contradicts the definition of $\Phi_{q_0}(t + s)$;
and so the fronts $\Phi_{q_t}(s)$ and $\Phi_{q_0}(t + s)$ are tangent at the point $q_{t+s}$, as was
to be proved. □


Figure 194 Proof of Huygens’ theorem


The theorem which has been proved is called Huygens’ principle. It is 
clear that the point $q_0$ could be replaced by a curve, surface, or, in general,
by a closed set, the three-dimensional space {q} by any smooth manifold, 
and propagation of light by the propagation of any disturbance transmitting 
itself “locally.” 
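A discrete analogue of Huygens' principle can be tested on a graph, where “light” travels along edges and the travel time of an edge plays the role of optical length. The sketch below rests on assumptions that are not in the text: a square grid, a made-up slowness function, and Dijkstra's algorithm as the propagation rule. It restarts the propagation from the discrete front at time t (each front node keeping its own arrival time) and compares the resulting arrival times beyond the front with those computed directly from the source.

```python
import heapq
import math

def build_grid(n):
    """4-neighbour n x n grid; an edge costs the mean 'slowness' of its two endpoints."""
    def slowness(i, j):
        return 1.0 + 0.5 * math.sin(0.7 * i) * math.cos(0.4 * j)   # hypothetical medium, always > 0
    graph = {}
    for i in range(n):
        for j in range(n):
            nbrs = []
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                if 0 <= a < n and 0 <= b < n:
                    nbrs.append(((a, b), 0.5 * (slowness(i, j) + slowness(a, b))))
            graph[(i, j)] = nbrs
    return graph

def travel_times(graph, sources):
    """Dijkstra's algorithm from several sources, each given with its own start time."""
    dist = dict(sources)
    heap = [(t, v) for v, t in sources.items()]
    heapq.heapify(heap)
    while heap:
        t, v = heapq.heappop(heap)
        if t > dist.get(v, math.inf):
            continue                      # stale heap entry
        for w, cost in graph[v]:
            if t + cost < dist.get(w, math.inf):
                dist[w] = t + cost
                heapq.heappush(heap, (t + cost, w))
    return dist

def huygens_gap(n=15, t_front=6.0):
    """Largest disagreement, beyond the front, between direct and relayed arrival times."""
    graph = build_grid(n)
    direct = travel_times(graph, {(0, 0): 0.0})
    # Discrete front at time t_front: nodes reached by then that have a neighbour reached later.
    front = {v: d for v, d in direct.items()
             if d <= t_front and any(direct[w] > t_front for w, _ in graph[v])}
    relayed = travel_times(graph, front)
    return max(abs(relayed[v] - d) for v, d in direct.items() if d > t_front)

print(huygens_gap())
```

The gap should vanish up to rounding: every fastest path to a node beyond the front leaves the region reached by time t through a front node, which is the graph version of the argument in the proof above.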

Huygens’ principle reduces to two descriptions of the process of prop- 
agation. First, we can trace the rays, i.e., the shortest paths of the propagation 
of light. In this case the local character of the propagation is given by a 
velocity vector $\dot q$. If the direction of the ray is known, then the magnitude
of the velocity vector is given by the characteristics of the medium (the 
indicatrix). 

On the other hand, we can trace the wave fronts. Assuming that we are 
given a riemannian metric on the space {q}, we can talk about the velocity 
of motion of the wave front. We look, for example, at the propagation of 
light in a medium filling ordinary euclidean space. Then one can characterize 
the motion of the wave front by a vector p perpendicular to the front, which 
will be constructed in the following manner. 
Figure 195 Direction of a ray and direction of motion of the wave front


For every point $q_0$ we define the function $S_{q_0}(q)$ as the optical length of
the path from $q_0$ to q, i.e., the least time of the propagation of light from $q_0$
to q. The level set $\{q : S_{q_0}(q) = t\}$ is nothing other than the wave front $\Phi_{q_0}(t)$
(Figure 195). The gradient of the function S (in the sense of the metric
mentioned above) is perpendicular to the wave front and characterizes the
motion of the wave front. In this connection, the bigger the gradient, the
slower the front moves. Therefore, Hamilton called the vector

$$p = \frac{\partial S}{\partial q}$$

the vector of normal slowness of the front.

The direction of the ray $\dot q$ and the direction of motion of the front p do not
coincide in an anisotropic medium. However, they are related to one another 
by a simple relationship, easily derived from Huygens’ principle. Recall 
that the characteristics of the medium are at every point described by a 
surface of velocity vectors of light—the indicatrix. 


Definition. The direction of the hyperplane tangent to the indicatrix at the 
point v is called conjugate to the direction v (Figure 196). 


Theorem. The direction of the wave front $\Phi_{q_0}(t)$ at the point $q_t$ is conjugate
to the direction of the ray $\dot q$.


PROOF. We look (Figure 197) at points $q_\tau$ of the ray $q_0 q_t$, 0 ≤ τ ≤ t. Take ε
very small. Then the front $\Phi_{q_{t-\varepsilon}}(\varepsilon)$ differs by quantities of order $O(\varepsilon^2)$ from
the indicatrix at the point $q_t$, contracted by ε. By Huygens’ principle, this
front $\Phi_{q_{t-\varepsilon}}(\varepsilon)$ is tangent to the front $\Phi_{q_0}(t)$ at the point $q_t$. Passing to the limit
as ε → 0, we obtain the theorem. □


Figure 196 Conjugate hyperplane

Figure 197 Conjugacy of the direction of a wave and of the front


If the auxiliary metric used to define the vector p is changed, the natural
velocity of the motion of the front, i.e., both the magnitude and direction of
the vector p, will be changed. However, the differential form p dq = dS
on the space $\{q\} = R^3$ is defined in a way which is independent of the
auxiliary metric; its value depends only on the chosen fronts (or rays). On the
hyperplane conjugate to the velocity vector of a ray, this form is equal to
zero, and its value on the velocity vector is equal to 1.⁷⁹


B The optical-mechanical analogy 


We return now to mechanics. Here the trajectories of motion are also 
extremals of a variational principle, and one can construct mechanics as 
the geometric optics of a many-dimensional space, as Hamilton did; we will 
not develop this construction in full detail, but will only enumerate those 
optical concepts which led Hamilton to basic mechanical concepts. 


Optics                                   Mechanics

Optical medium                           Extended configuration space {(q, t)}
Fermat’s principle                       Hamilton’s principle $\delta \int L\,dt = 0$
Rays                                     Trajectories q(t)
Indicatrices                             Lagrangian L
Normal slowness vector p of the front    Momentum p
Expression of p in terms of the          Legendre transformation
  velocity of the ray, $\dot q$
1-form p dq                              1-form p dq − H dt


79 In this way, the vectors p corresponding to various fronts passing through a given point are not
arbitrary, but are subject to one condition: the permissible values of p fill a hypersurface in
{p}-space which is dual to the indicatrix of velocities.
The optical length of the path $S_{q_0}(q)$ and Huygens’ principle have not yet
been used. Their mechanical analogues are the action function and the
Hamilton-Jacobi equation, to which we now turn.


C Action as a function of coordinates and time 


Definition. The action function S(q, t) is the integral

$$S_{q_0, t_0}(q, t) = \int_\gamma L\,dt$$

along the extremal $\gamma$ connecting the points $(q_0, t_0)$ and (q, t).


In order for this definition to be correct, we must take several precautions: 
we must require that the extremals going from the point (qo, to) do not inter- 
sect elsewhere, but instead form a so-called “central field of extremals” 
(Figure 198). More precisely, we associate to every pair $(\dot q_0, t)$ a point (q, t)
which is the end of the extremal with initial condition $q(0) = q_0$, $\dot q(0) = \dot q_0$.
We say that an extremal $\gamma$ is contained in a central field if the mapping
$(\dot q_0, t) \to (q, t)$ is nondegenerate (at the point corresponding to the extremal
$\gamma$ under consideration, and therefore in some neighborhood of it).


Figure 198 A central field of extremals


It can be shown that for $|t - t_0|$ small enough the extremal is contained in
a central field.⁸⁰

We now look at a sufficiently small neighborhood of the endpoint (q, t) 
of our extremal. Every point of this neighborhood is connected to (qo, to) 
by a unique extremal of the central field under consideration. This extremal 
depends differentiably on the endpoint (q, t). Therefore, in the indicated 
neighborhood the action function is correctly defined

$$S_{q_0, t_0}(q, t) = \int_\gamma L\,dt.$$


In geometric optics we were looking at the differential of the optical 
length of a path. It is natural here to look at the differential of the action 
function. 


80 PROBLEM. Show that this is not true for large $t - t_0$. Hint. $\ddot q = -q$ (Figure 199).

Figure 199 Extremal with a focal point which is not contained in any central field


Theorem. The differential of the action function (for a fixed initial point) is
equal to

$$dS = p\,dq - H\,dt$$

where $p = \partial L/\partial \dot q$ and $H = p\dot q - L$ are defined with the help of the terminal
velocity $\dot q$ of the trajectory $\gamma$.


PROOF. We lift every extremal from (q, t)-space to the extended phase space
{(p, q, t)}, setting $p = \partial L/\partial \dot q$, i.e., replacing the extremal by a phase trajectory.
We then get an (n + 1)-dimensional manifold in the extended phase space
consisting of phase trajectories, i.e., characteristic curves of the form
p dq − H dt. We now give the endpoint (q, t) an increment (Δq, Δt), and
consider the set of extremals connecting $(q_0, t_0)$ with points of the segment
q + θΔq, t + θΔt, 0 ≤ θ ≤ 1 (Figure 200). In phase space we get a quadrangle
σ composed of characteristic curves of the form p dq − H dt, the boundary
of which consists of two phase trajectories $\gamma_1$ and $\gamma_2$, a segment of a curve α
lying in the space $(q = q_0, t = t_0)$, and a segment of a curve β projecting
to the segment (Δq, Δt). Since σ consists of characteristic curves of the
form p dq − H dt, we have

$$0 = \iint_\sigma d(p\,dq - H\,dt) = \int_{\partial\sigma} p\,dq - H\,dt = \int_{\gamma_1} - \int_{\gamma_2} + \int_\beta - \int_\alpha \; p\,dq - H\,dt.$$

But, on the segment α, we have dq = 0, dt = 0. On the phase trajectories $\gamma_1$ and
$\gamma_2$, p dq − H dt = L dt (Section 45C). So, the difference $\int_{\gamma_2} - \int_{\gamma_1} p\,dq - H\,dt$
is equal to the increase of the action function, and we find

$$\int_\beta p\,dq - H\,dt = S(q + \Delta q, t + \Delta t) - S(q, t).$$

If now Δq → 0, Δt → 0, then

$$\int_\beta p\,dq - H\,dt = p\,\Delta q - H\,\Delta t + o(\Delta t, \Delta q),$$

which proves the theorem. □

Figure 200 Calculation of the differential of the action function


The form p dq — H dt was formerly introduced to us artificially. We see 
now, by carrying out the optical-mechanical analogue, that it arises from 
examining the action function corresponding to the optical length of a path. 


D The Hamilton-Jacobi equation 


Recall that the “vector of normal slowness p” cannot be altogether arbitrary: 
it is subject to one condition, $p\dot q = 1$, following from Huygens’ principle.
An analogous condition restricts the gradient of the action function S. 


Theorem. The action function satisfies the equation

(1)   $$\frac{\partial S}{\partial t} + H\left(\frac{\partial S}{\partial q}, q, t\right) = 0.$$


This nonlinear first-order partial differential equation is called the 
Hamilton-Jacobi equation. 


PROOF. It is sufficient to notice that, by the previous theorem,

$$\frac{\partial S}{\partial t} = -H(p, q, t), \qquad p = \frac{\partial S}{\partial q}. \qquad \square$$
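The Hamilton-Jacobi equation can be checked on a case where the action function is known in closed form. The sketch below uses assumptions that are not in the text: a harmonic oscillator of unit mass with $H = (p^2 + \omega^2 q^2)/2$, whose action between $(q_0, t_0)$ and (q, t) is the standard expression $\omega[(q^2 + q_0^2)\cos\omega T - 2qq_0]/(2\sin\omega T)$ with $T = t - t_0$, valid while $\omega T < \pi$. The partial derivatives of S are approximated by central differences.

```python
import math

OMEGA = 1.3   # assumed oscillator frequency (unit mass)

def action(q, t, q0=0.4, t0=0.0):
    """Closed-form action of the harmonic oscillator from (q0, t0) to (q, t)."""
    elapsed = t - t0
    s, c = math.sin(OMEGA * elapsed), math.cos(OMEGA * elapsed)
    return OMEGA * ((q * q + q0 * q0) * c - 2.0 * q * q0) / (2.0 * s)

def hamiltonian(p, q):
    return 0.5 * p * p + 0.5 * OMEGA ** 2 * q * q

def hj_residual(q, t, step=1e-5):
    """Finite-difference value of dS/dt + H(dS/dq, q) at the point (q, t)."""
    dS_dq = (action(q + step, t) - action(q - step, t)) / (2.0 * step)
    dS_dt = (action(q, t + step) - action(q, t - step)) / (2.0 * step)
    return dS_dt + hamiltonian(dS_dq, q)

print(hj_residual(0.7, 0.9))
```

The residual should be zero up to the error of the difference quotients. For $\omega T \ge \pi$ the extremals from $(q_0, t_0)$ refocus and the formula above no longer defines a single-valued action.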


The relation just established between trajectories of mechanical systems 
(“rays”) and partial differential equations (“wave fronts”) can be used in 
two directions. 

First, solutions of Equation (1) can be used for integrating the ordinary 
differential equations of dynamics. Jacobi’s method of integrating Hamilton’s 
canonical equations, presented in the next section, consists of just this. 

Second, the relation of the ray and wave points of view allows one to 
reduce integration of the partial differential equations (1) to integration 
of a hamiltonian system of ordinary differential equations. 

Let us go into this in a little more detail. For the Hamilton-Jacobi 
equation (1), the Cauchy problem is 


(2)   $$S(q, t_0) = S_0(q), \qquad \frac{\partial S}{\partial t} + H\left(\frac{\partial S}{\partial q}, q, t\right) = 0.$$
In order to construct a solution to this problem, we look at the hamiltonian
system

$$\dot p = -\frac{\partial H}{\partial q}, \qquad \dot q = \frac{\partial H}{\partial p}.$$

We consider the initial conditions (Figure 201):

$$q(t_0) = q_0, \qquad p(t_0) = \left.\frac{\partial S_0}{\partial q}\right|_{q_0}.$$

The solution corresponding to these equations is represented in (q, t)-space
by the curve q = q(t), which is the extremal of the principle $\delta \int L\,dt = 0$
(where the lagrangian $L(q, \dot q, t)$ is the Legendre transformation with respect
to p of the hamiltonian function H(p, q, t)). This extremal is called the
characteristic of problem (2), emanating from the point $q_0$.

If the value $t_1$ is sufficiently close to $t_0$, then the characteristics emanating
from points close to $q_0$ do not intersect for $t_0 \le t \le t_1$, $|q - q_0| < R$.
Furthermore, the values of $q_0$ and t can be taken as coordinates for points
in the region $|q - q_0| < R$, $t_0 \le t \le t_1$ (Figure 201).


Figure 201 Characteristics for a solution of Cauchy’s problem for the Hamilton-
Jacobi equation


We now construct the “action function with initial condition $S_0$”:

(3)   $$S(A) = S_0(q_0) + \int_{q_0, t_0}^{A} L(q, \dot q, t)\,dt$$

(integrating along the characteristic leading to A).


Theorem. The function (3) is a solution of problem (2). 


Proor. The initial condition is clearly fulfilled. The fact that the Hamilton- 
Jacobi equation is satisfied is verified just as in the theorem on differentials 
of action functions (Figure 202). 


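By Stokes’ lemma, $\int_{\gamma_1} - \int_{\gamma_2} + \int_\beta - \int_\alpha \; p\,dq - H\,dt = 0$. But on α, H dt = 0 and $p = \partial S_0/\partial q$,
so

$$\int_\alpha p\,dq - H\,dt = \int_\alpha p\,dq = \int_\alpha dS_0 = S_0(q_0 + \Delta q) - S_0(q_0).$$

Figure 202 The action function as a solution of the Hamilton-Jacobi equation

Further, $\gamma_1$ and $\gamma_2$ are phase trajectories, so

$$\int_{\gamma_{1,2}} p\,dq - H\,dt = \int_{\gamma_{1,2}} L\,dt.$$

So

$$\int_\beta p\,dq - H\,dt = \left[S_0(q_0 + \Delta q) + \int_{\gamma_2} L\,dt\right] - \left[S_0(q_0) + \int_{\gamma_1} L\,dt\right] = S(A + \Delta A) - S(A).$$

For Δt, Δq → 0, we get $\partial S/\partial t = -H$, $\partial S/\partial q = p$, which proves the theorem. □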
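The construction (3) can be carried out numerically in a simple case. The sketch below is only an illustration under assumptions chosen here, not taken from the text: the free-particle hamiltonian $H = p^2/2$, the initial time $t_0 = 0$, and the initial condition $S_0(q) = q^2/2$. The characteristics are then the straight lines $q = q_0 + S_0'(q_0)\,t$, along which $L = p^2/2$ is constant. The foot $q_0$ of the characteristic through a given point is found by bisection, which presumes the characteristics have not yet crossed.

```python
def solve_cauchy(q, t, S0, dS0, lo=-10.0, hi=10.0):
    """Value S(q, t) built from formula (3) for the assumed H = p^2/2 and t0 = 0."""
    def mismatch(q0):
        return q0 + t * dS0(q0) - q        # zero when the characteristic from q0 hits (q, t)
    for _ in range(200):                   # bisection; assumes one sign change on [lo, hi]
        mid = 0.5 * (lo + hi)
        if mismatch(lo) * mismatch(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    q0 = 0.5 * (lo + hi)
    p0 = dS0(q0)                           # momentum is constant along a free characteristic
    return S0(q0) + 0.5 * p0 * p0 * t      # S0(q0) plus the integral of L = p^2/2

def initial_action(q):
    return 0.5 * q * q

def initial_momentum(q):
    return q

value = solve_cauchy(0.8, 0.5, initial_action, initial_momentum)
closed_form = 0.8 ** 2 / (2.0 * (1.0 + 0.5))   # q^2 / (2 (1 + t)) for this choice of S0
print(value, closed_form)
```

For this particular choice one can verify by hand that $q^2/(2(1 + t))$ satisfies $\partial S/\partial t + \frac{1}{2}(\partial S/\partial q)^2 = 0$ with the required initial condition, and the numerical value should agree with it.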


PROBLEM. Show the uniqueness of the solution to problem (2). 
Hint. Differentiate S along the characteristics. 


PROBLEM. Solve the Cauchy problem (2) for 


PROBLEM. Draw a graph of the multiple-valued “functions” S(q) and p(q) for t = t3 (Figure 201). 


ANSWER. Cf. Figure 203. 


Figure 203 A typical singularity of a solution of the Hamilton-Jacobi equation 
The point of self-intersection of the graph of S corresponds on the graph of p to the Maxwell
line: the shaded areas are equal. The graph of S(q, t) has a singularity called a swallowtail at the 
point (0, t,). 


47 The Hamilton-Jacobi method for integrating 
Hamilton’s canonical equations 


In this paragraph we define the generating function of a free canonical transformation. 


The idea of the Hamilton-Jacobi method consists of the following. Under 
canonical changes of coordinates, the canonical form of the equations of 
motion is preserved, as is the hamiltonian function (Section 45A). Therefore, 
if we succeed in finding a canonical transformation which reduces the 
hamiltonian function to a form such that the canonical equations can be 
integrated, then we can also integrate the original canonical equations. It 
turns out that the problem of constructing such a canonical transformation 
reduces to the determination of a sufficiently large number of solutions to 
the Hamilton-Jacobi partial differential equation. The generating function 
of the desired canonical transformation must satisfy this equation. 

Before turning to the apparatus of generating functions, we remark 
that it is unfortunately noninvariant and it uses, in an essential way, the co- 
ordinate structure in phase space {(p, q)}. It is necessary to use the apparatus 
of partial derivatives, in which even the notation is ambiguous.⁸¹


A Generating functions 


Suppose that the 2n functions P(p, q) and Q(p, q) of the 2n variables p and q
give a canonical transformation g: $R^{2n} \to R^{2n}$. Then the 1-form p dq − P dQ
is an exact differential (Section 45A):


(1) p dq — P dQ = dS(p, q). 


PROBLEM. Show the converse: if this form is an exact differential, then the transformation is 
canonical. 


We now assume that, in a neighborhood of some point $(p_0, q_0)$, we can
take (Q, q) as independent coordinates. In other words, we assume that
the following jacobian is not zero at $(p_0, q_0)$:

$$\det \frac{\partial(Q, q)}{\partial(p, q)} = \det \frac{\partial Q}{\partial p} \ne 0.$$


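81 It is important to note that the quantity $\partial u/\partial x$ on the x, y-plane depends not only on the
function which is taken for x, but also on the choice of the function y: in new variables (x, z)
the value of $\partial u/\partial x$ will be different. One should write

$$\left.\frac{\partial u}{\partial x}\right|_{y = \text{const}}, \qquad \left.\frac{\partial u}{\partial x}\right|_{z = \text{const}}.$$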
Such canonical transformations will be called free. In this case, the function S
can be expressed locally in these coordinates:

$$S(p, q) = S_1(Q, q).$$


Definition. The function $S_1(Q, q)$ is called a generating function of our
canonical transformation g.

We emphasize that $S_1$ is not a function on the phase space $R^{2n}$: it is a
function on a region in the direct product $R^n_q \times R^n_Q$ of two n-dimensional
coordinate spaces, whose points are denoted by q and Q. It follows from (1)
that the “partial derivatives” of $S_1$ are

(2)   $$\frac{\partial S_1(Q, q)}{\partial q} = p \quad\text{and}\quad \frac{\partial S_1(Q, q)}{\partial Q} = -P.$$

Conversely, every function $S_1$ gives a canonical transformation g by
formulas (2).


Theorem. Let $S_1(Q, q)$ be a function given on a neighborhood of some point
$(Q_0, q_0)$ of the direct product of two n-dimensional euclidean spaces. If

$$\det \left.\frac{\partial^2 S_1}{\partial Q\,\partial q}\right|_{Q_0, q_0} \ne 0,$$

then $S_1$ is a generating function of some free canonical transformation.


PROOF. Consider the equation for the Q coordinates:

$$\frac{\partial S_1(Q, q)}{\partial q} = p.$$

By the implicit function theorem this equation can be solved to determine a
function Q(p, q) in a neighborhood of the point

$$\left(q_0,\; p_0 = \left.\frac{\partial S_1(Q, q)}{\partial q}\right|_{Q_0, q_0}\right)$$

(with $Q(p_0, q_0) = Q_0$). In fact, the determinant we need here is

$$\det \left.\left(\frac{\partial^2 S_1(Q, q)}{\partial Q\,\partial q}\right)\right|_{Q_0, q_0},$$

and this is different from zero by hypothesis.
We now consider the function

$$P_1(Q, q) = -\frac{\partial}{\partial Q} S_1(Q, q),$$

and set

$$P(p, q) = P_1(Q(p, q), q).$$

Then the local map g: $R^{2n} \to R^{2n}$ sending the point (p, q) to the point
(P(p, q), Q(p, q)) will be canonical with generating function $S_1$, since by
construction

$$p\,dq - P\,dQ = \frac{\partial S_1(Q, q)}{\partial q}\,dq + \frac{\partial S_1(Q, q)}{\partial Q}\,dQ.$$

It is free, since $\det(\partial Q/\partial p) = \det(\partial^2 S_1(Q, q)/\partial Q\,\partial q)^{-1} \ne 0$. □


The transformation g: $R^{2n} \to R^{2n}$ is given in general by 2n functions of
2n variables. We see that a canonical transformation is given entirely by 
one function of 2n variables—its generating function. It is easy to see how 
useful generating functions are in all calculations related to canonical trans- 
formations. This becomes even more so as the number of variables, 2n, 
becomes large. 
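A one-degree-of-freedom example makes formulas (2) concrete. The generating function below, $S_1(Q, q) = qQ + \varepsilon Q^3/3$, is an illustrative assumption and does not come from the text. Formulas (2) give p = Q and $P = -q - \varepsilon Q^2$, and $\partial^2 S_1/\partial Q\,\partial q = 1 \ne 0$, so the hypothesis of the theorem holds. For n = 1 a transformation is canonical exactly when the jacobian determinant of (p, q) → (P, Q) equals 1, which the sketch checks by finite differences.

```python
EPSILON = 0.3   # parameter of the assumed generating function S1(Q, q) = q*Q + EPSILON*Q**3/3

def transform(p, q):
    """Map (p, q) to (P, Q) using formulas (2) for the assumed S1."""
    Q = p                          # from p = dS1/dq = Q
    P = -q - EPSILON * Q * Q       # from P = -dS1/dQ
    return P, Q

def jacobian_det(p, q, step=1e-6):
    """Central-difference determinant of the derivative of (p, q) -> (P, Q)."""
    P_hi, Q_hi = transform(p + step, q)
    P_lo, Q_lo = transform(p - step, q)
    dP_dp, dQ_dp = (P_hi - P_lo) / (2 * step), (Q_hi - Q_lo) / (2 * step)
    P_hi, Q_hi = transform(p, q + step)
    P_lo, Q_lo = transform(p, q - step)
    dP_dq, dQ_dq = (P_hi - P_lo) / (2 * step), (Q_hi - Q_lo) / (2 * step)
    return dP_dp * dQ_dq - dP_dq * dQ_dp

print(jacobian_det(0.7, -1.1))
```

The whole two-dimensional map is recovered from the single function $S_1$, which is the economy the paragraph above describes.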


B The Hamilton—Jacobi equation for generating functions 


We notice that canonical equations in which the hamiltonian function
depends only on the variable Q are easy to integrate. If H = K(Q, t), then the
canonical equations have the form

(3)   $$\dot Q = 0, \qquad \dot P = -\frac{\partial K}{\partial Q},$$

from which we have immediately

$$Q(t) = Q(0), \qquad P(t) = P(0) - \int_0^t \left.\frac{\partial K}{\partial Q}\right|_{Q(0)} dt.$$


We will now look for a canonical transformation reducing the hamiltonian 
H(p, q) to the form K(Q). To this end we will look for a generating function 
of such a transformation, S(Q, q). From (2) we obtain the condition 


(4)   $$H\left(\frac{\partial S(Q, q)}{\partial q}, q, t\right) = K(Q, t),$$


where after differentiation we must substitute q(P, Q) for q. We notice that 
for fixed Q, Equation (4) has the form of the Hamilton-Jacobi equation. 


Jacobi’s theorem. If a solution S(Q, q) is found to the Hamilton-Jacobi equation
(4), depending on n parameters⁸² $Q_i$ and such that $\det(\partial^2 S/\partial Q\,\partial q) \ne 0$,
then the canonical equations

(5)   $$\dot p = -\frac{\partial H}{\partial q} \quad\text{and}\quad \dot q = \frac{\partial H}{\partial p}$$

can be solved explicitly by quadratures. The functions Q(p, q) determined
by the equations $\partial S(Q, q)/\partial q = p$ are first integrals of the equations (5).


82 An n-parameter family of solutions of (4) is called a complete integral of the equation. 
PROOF. Consider the canonical transformation with generating function
S(Q, q). By (2) we have $p = (\partial S/\partial q)(Q, q)$, from which we can determine
Q(p, q). We calculate the function H(p, q) in the new coordinates P, Q.
We have $H(p, q) = H((\partial S/\partial q)(Q, q), q)$. In order to find the hamiltonian
function in the new coordinates we must substitute into this expression
(after differentiation) for q its expression in terms of P and Q. However,
by (4), this expression does not depend on P at all, so we have simply

$$H(p, q) = K(Q).$$

Thus, in the new variables, Equation (5) has the form (3), from which Jacobi’s
theorem follows directly. □


Jacobi’s theorem reduces solving the system of ordinary differential 
equations (5) to finding a complete integral of the partial differential equation 
(4). It may appear surprising that this “reduction” from the simple to the 
complicated provides an effective method for solving concrete problems. 
Nevertheless, it turns out that this is the most powerful method known for 
exact integration, and many problems which were solved by Jacobi cannot be 
solved by other methods. 


C Examples 


We consider the problem of attraction by two fixed centers. Interest in this 
problem has grown recently in connection with the study of the motion of 
artificial earth satellites. It is fairly clear that two close centers of attraction 
on the z-axis approximate attraction by an ellipsoid slightly extended along 
the z-axis. Unfortunately, the earth is not prolate, but oblate. To overcome 
this difficulty, one must place the centers at imaginary points at distances $\pm i\varepsilon$
from the origin along the z-axis. Analytic formulas for the solution are true, 
of course, in the complex region. In this way we obtain an approximation 
to the earth’s field of gravity, in which the equations of motion can be exactly 
integrated and which is closer to reality than the keplerian approximation 
in which the earth is a point. 

For simplicity we will consider only the planar problem of attraction by 
two fixed points with equal masses. The success of Jacobi’s method is based 
on the adoption of a suitable coordinate system, called elliptic coordinates. 
Suppose that the distance between the fixed points $O_1$ and $O_2$ is $2c$ (Figure
204), and that the distances of a moving mass from them are $r_1$ and $r_2$,
respectively. The elliptic coordinates $\xi$, $\eta$ are defined as the sum and difference
of the distances to the points $O_1$ and $O_2$: $\xi = r_1 + r_2$, $\eta = r_1 - r_2$.


Figure 204 Elliptic coordinates

Figure 205 Confocal ellipses and hyperbolas


PROBLEM. Express the hamiltonian function in elliptic coordinates.
Solution. The lines $\xi = \text{const}$ are ellipses with foci at $O_1$ and $O_2$; the lines $\eta = \text{const}$ are
hyperbolas with the same foci (Figure 205). They are mutually orthogonal; therefore,

$$ds^2 = a^2\,d\xi^2 + b^2\,d\eta^2.$$

We will find the coefficients $a$ and $b$. For motion along an ellipse we have $dr_1 = ds\cos\alpha$ and
$dr_2 = -ds\cos\alpha$, so $d\eta = 2\cos\alpha\,ds$. For motion along a hyperbola we have $dr_1 = ds\sin\alpha$
and $dr_2 = ds\sin\alpha$, so $d\xi = 2\sin\alpha\,ds$. Thus $a = (2\sin\alpha)^{-1}$ and $b = (2\cos\alpha)^{-1}$. Furthermore,
from the triangle $O_1MO_2$ we find $r_1^2 + r_2^2 + 2r_1r_2\cos 2\alpha = 4c^2$, which implies

$$\cos^2\alpha - \sin^2\alpha = \frac{4c^2 - r_1^2 - r_2^2}{2r_1r_2}, \qquad \cos^2\alpha + \sin^2\alpha = \frac{2r_1r_2}{2r_1r_2},$$

$$\cos^2\alpha = \frac{4c^2 - (r_1 - r_2)^2}{4r_1r_2}, \qquad \sin^2\alpha = \frac{(r_1 + r_2)^2 - 4c^2}{4r_1r_2}.$$

But if $ds^2 = \sum a_i^2\,dq_i^2$, then

$$T = \sum \frac{a_i^2\dot{q}_i^2}{2}, \qquad p_i = a_i^2\dot{q}_i, \qquad H = \sum \frac{p_i^2}{2a_i^2} + U.$$

Thus,

$$H = p_\xi^2\,\frac{(r_1 + r_2)^2 - 4c^2}{2r_1r_2} + p_\eta^2\,\frac{4c^2 - (r_1 - r_2)^2}{2r_1r_2} - \frac{k}{r_1} - \frac{k}{r_2}.$$

But $r_1 + r_2 = \xi$, $r_1 - r_2 = \eta$, $4r_1r_2 = \xi^2 - \eta^2$. Therefore, finally,

$$H = 2p_\xi^2\,\frac{\xi^2 - 4c^2}{\xi^2 - \eta^2} + 2p_\eta^2\,\frac{4c^2 - \eta^2}{\xi^2 - \eta^2} - \frac{4k\xi}{\xi^2 - \eta^2}.$$
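
The final expression for the hamiltonian function can be checked numerically. The following is a minimal sketch, not part of the original solution: it assumes unit mass, centers placed at $(-c, 0)$ and $(+c, 0)$, and the function names are hypothetical. It evaluates $H$ once in elliptic coordinates and once as $|p|^2/2 - k/r_1 - k/r_2$ in cartesian coordinates, using the fact that momenta transform as a covector.

```python
import math


def elliptic_hamiltonian(xi, eta, p_xi, p_eta, c, k):
    """H(p_xi, p_eta, xi, eta) for two fixed centers of equal strength k (unit mass)."""
    d = xi**2 - eta**2  # equals 4 * r1 * r2
    return (2 * p_xi**2 * (xi**2 - 4 * c**2) / d
            + 2 * p_eta**2 * (4 * c**2 - eta**2) / d
            - 4 * k * xi / d)


def cartesian_hamiltonian(x, y, p_xi, p_eta, c, k):
    """Same state, evaluated as |p|^2 / 2 - k/r1 - k/r2 in cartesian coordinates.

    Returns (H, xi, eta). Assumption: the centers sit at (-c, 0) and (+c, 0).
    """
    r1 = math.hypot(x + c, y)
    r2 = math.hypot(x - c, y)
    # grad r1 and grad r2 are unit vectors pointing away from the centers.
    g1 = ((x + c) / r1, y / r1)
    g2 = ((x - c) / r2, y / r2)
    # Covector rule: p = p_xi * grad(xi) + p_eta * grad(eta),
    # with xi = r1 + r2 and eta = r1 - r2.
    px = p_xi * (g1[0] + g2[0]) + p_eta * (g1[0] - g2[0])
    py = p_xi * (g1[1] + g2[1]) + p_eta * (g1[1] - g2[1])
    h = 0.5 * (px**2 + py**2) - k / r1 - k / r2
    return h, r1 + r2, r1 - r2


c, k = 1.0, 1.0
p_xi, p_eta = 0.4, -0.9
h_cart, xi, eta = cartesian_hamiltonian(0.7, 1.3, p_xi, p_eta, c, k)
h_ell = elliptic_hamiltonian(xi, eta, p_xi, p_eta, c, k)
print(abs(h_cart - h_ell))  # expected to be at rounding-error level
```

The agreement rests on the orthogonality of the two coordinate families: the cross term between $p_\xi$ and $p_\eta$ in $|p|^2$ vanishes because $\nabla\xi \cdot \nabla\eta = |\nabla r_1|^2 - |\nabla r_2|^2 = 0$.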


We will now solve the Hamilton-Jacobi equation. 


Definition. If, in the equation

$$\Phi_1\!\left(\frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n};\ q_1, \ldots, q_n\right) = 0,$$

the variable $q_1$ and derivative $\partial S/\partial q_1$ appear only in the form of a combination
$\varphi(\partial S/\partial q_1, q_1)$, then we say that the variable $q_1$ is separable.


In this case it is useful to look for a solution of the equation of the form

$$S = S_1(q_1) + S'(q_2, \ldots, q_n).$$

By setting $\varphi(\partial S_1/\partial q_1, q_1) = c_1$ in this equation, we obtain an equation for $S'$
with a smaller number of variables

$$\Phi_2\!\left(\frac{\partial S'}{\partial q_2}, \ldots, \frac{\partial S'}{\partial q_n};\ q_2, \ldots, q_n;\ c_1\right) = 0.$$

Let $S' = S'(q_2, \ldots, q_n; c_1, c)$ be a family of solutions to this equation
depending on the parameters $c_i$. The functions $S_1(q_1, c_1) + S'$ will satisfy
the desired equation if $S_1$ satisfies the ordinary differential equation
$\varphi(\partial S_1/\partial q_1, q_1) = c_1$. This equation is easy to solve; we express $\partial S_1/\partial q_1$
in terms of $q_1$ and $c_1$ to obtain $\partial S_1/\partial q_1 = \psi(q_1, c_1)$, from which
$S_1 = \int^{q_1} \psi(q_1, c_1)\,dq_1$.

If one of the variables, say $q_2$, is separable in the new equation (with $\Phi_2$)
we can repeat this procedure and (in the most favorable case) we can find
a solution of the original equation depending on $n$ constants

$$S_1(q_1; c_1) + S_2(q_2; c_1, c_2) + \cdots + S_n(q_n; c_1, \ldots, c_n).$$


In this case we say that the variables are completely separable. 

If the variables are completely separable, then a solution depending on $n$
parameters of the Hamilton-Jacobi equation, $\Phi_1(\partial S/\partial q, q) = 0$, is found by
quadratures. But then the corresponding system of canonical equations can 
also be integrated by quadratures (Jacobi’s theorem). 
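
As a short illustration of the procedure (an example chosen here for concreteness, not taken from the text above), consider planar motion of a unit mass in a central field $U(r)$, written in polar coordinates $(r, \varphi)$. The symbols $U$, $r$, $\varphi$ and the use of $K$ for the value of the hamiltonian function are assumptions of this sketch.

```latex
% Hamilton-Jacobi equation for H = p_r^2/2 + p_\varphi^2/(2 r^2) + U(r):
\frac{1}{2}\left(\frac{\partial S}{\partial r}\right)^{2}
  + \frac{1}{2r^{2}}\left(\frac{\partial S}{\partial \varphi}\right)^{2}
  + U(r) = K .
% The variable \varphi enters only through \partial S/\partial\varphi, so it is
% separable; set \partial S_1/\partial\varphi = c_1, i.e. S_1 = c_1 \varphi.
% The remaining equation involves r alone:
\frac{1}{2}\left(\frac{dS'}{dr}\right)^{2} + \frac{c_1^{2}}{2r^{2}} + U(r) = K ,
% and is solved by a single quadrature, giving the two-parameter family
S(r, \varphi;\, c_1, K) = c_1\varphi
  + \int \sqrt{\,2\bigl(K - U(r)\bigr) - \frac{c_1^{2}}{r^{2}}\,}\; dr .
```

Here both variables separate in turn, so the family depends on $n = 2$ constants, as the definition of complete separability requires.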

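We apply the above to the problem of two fixed centers. The Hamilton-Jacobi
equation (4) has the form

$$\left(\frac{\partial S}{\partial \xi}\right)^2(\xi^2 - 4c^2) + \left(\frac{\partial S}{\partial \eta}\right)^2(4c^2 - \eta^2) = K(\xi^2 - \eta^2) + 4k\xi.$$

We can separate variables by, for instance, setting

$$\left(\frac{\partial S}{\partial \xi}\right)^2(\xi^2 - 4c^2) - 4k\xi - K\xi^2 = c_1$$

and

$$\left(\frac{\partial S}{\partial \eta}\right)^2(4c^2 - \eta^2) + K\eta^2 = -c_1.$$

Then we find the complete integral of Equation (4) in the form

$$S(\xi, \eta; c_1, c_2) = \int \sqrt{\frac{c_1 + c_2\xi^2 + 4k\xi}{\xi^2 - 4c^2}}\;d\xi + \int \sqrt{\frac{-c_1 - c_2\eta^2}{4c^2 - \eta^2}}\;d\eta,$$

where $c_2$ denotes the constant $K$.

Jacobi’s theorem now gives an explicit expression, in terms of elliptic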
integrals, for motion in the problem of two fixed centers. A more detailed 
investigation of this motion can be found in Charlier’s book “ Die Mechanik 
des Himmels,” Berlin, Leipzig, W. de Gruyter & Co., 1927. 

Another application of the problem of the attraction of two fixed centers is 
the study of motion with fixed pull in a field with one attracting center. 

This is a question of the motion of a point mass under the action of a 
newtonian attraction of a fixed center and one more force (“pull”) of con- 
stant magnitude and direction. This problem can be looked at as the limiting 
case of the problem of attraction by two fixed centers. In the passage to 
the limit, one center goes off to infinity in the direction of the thrust force 
(during which its mass must grow proportionally to the square of the distance 
moved in order to guarantee constant pull). 

This limiting case of the problem of the attraction of two fixed centers 
can be integrated explicitly (in elliptic functions). We can convince ourselves 
of this by passing to a limit or by directly separating variables in the problem 
of motion with constant pull in a field with one center. The coordinates 
in which the variables are separated in this problem are obtained as the 
limit of elliptic coordinates as one of the centers approaches infinity. They 
are called parabolic coordinates and are given by the formulas 


$$u = r - x, \qquad v = r + x$$


(the pull is directed along the x-axis). 
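
The parabolic coordinates are easy to experiment with. The sketch below is illustrative only (the function names are hypothetical): it converts a point of the plane to $(u, v)$ and back, using the identities $x = (v - u)/2$ and $y^2 = uv$, which follow from $uv = r^2 - x^2$. Since $(u, v)$ determine $y$ only up to sign, the sign is passed in explicitly.

```python
import math


def to_parabolic(x, y):
    """Parabolic coordinates u = r - x, v = r + x of the point (x, y)."""
    r = math.hypot(x, y)
    return r - x, r + x


def from_parabolic(u, v, y_sign=1.0):
    """Invert the map: x = (v - u) / 2 and y^2 = u * v (sign of y supplied by the caller)."""
    x = 0.5 * (v - u)
    y = y_sign * math.sqrt(u * v)
    return x, y


u, v = to_parabolic(0.3, -1.2)
x_back, y_back = from_parabolic(u, v, y_sign=-1.0)
print(x_back, y_back)  # should reproduce the starting point up to rounding
```

The curves $u = \text{const}$ and $v = \text{const}$ are confocal parabolas with focus at the attracting center, which is the limiting form of the confocal ellipses and hyperbolas of the two-center problem.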

A description of the trajectories of a motion with constant pull (many 
of which are very intricate) can be found in V. V. Beletskii’s book “Sketches 
on the motion of celestial bodies,” Nauka, 1972. 

As one more example we consider the problem of geodesics on a triaxial
ellipsoid.⁸³ Here Jacobi’s elliptic coordinates $\lambda_1$, $\lambda_2$, and $\lambda_3$ are helpful, where
the $\lambda_i$ are the roots of the equation

$$\frac{x_1^2}{a_1 + \lambda} + \frac{x_2^2}{a_2 + \lambda} + \frac{x_3^2}{a_3 + \lambda} = 1, \qquad \lambda_1 > \lambda_2 > \lambda_3;$$

$x_1$, $x_2$, and $x_3$ are cartesian coordinates. We will not carry out the
computations showing that the variables are separable (they can be found, for example,
in Jacobi’s “Lectures on dynamics”), but will mention only the result: we 
will describe the behavior of the geodesics. 

The surfaces $\lambda_1 = \text{const}$, $\lambda_2 = \text{const}$, and $\lambda_3 = \text{const}$ are surfaces of
second degree, called confocal quadrics. The first of these is an ellipsoid, the 
second a hyperboloid of one sheet, and the third a hyperboloid of two sheets. 
The ellipsoid can degenerate into the interior of an ellipse, the one-sheeted 
hyperboloid either into the exterior of an ellipse or into the part of a plane 


83 The problem of geodesics on an ellipsoid and the closely related problem of ellipsoidal 
billiards have found application in a series of recent results in physics connected with laser 
devices. 




between the branches of a hyperbola, and the two-sheeted hyperboloid 
either into the part of a plane outside the branches of a hyperbola or into a 
plane. 
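
For a concrete point the three elliptic coordinates can be found numerically. The following sketch is an illustration under stated assumptions, not a method from the text: it assumes $a_1 > a_2 > a_3$ and all $x_i \neq 0$, and it uses plain bisection, relying on the fact that the left-hand side of the defining equation is strictly decreasing in $\lambda$ on each of the intervals $(-a_3, \infty)$, $(-a_2, -a_3)$, $(-a_1, -a_2)$. The function name is hypothetical.

```python
def confocal_roots(x, a, iterations=200):
    """Roots lam_1 > lam_2 > lam_3 of sum(x_i^2 / (a_i + lam)) = 1.

    Assumptions: a[0] > a[1] > a[2] and every x_i is nonzero.
    """
    def f(lam):
        return sum(xi * xi / (ai + lam) for xi, ai in zip(x, a)) - 1.0

    def bisect(lo, hi):
        # f is strictly decreasing on (lo, hi), positive near lo, negative near hi.
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    tiny = 1e-12                      # keeps the search off the poles at -a_i
    total = sum(xi * xi for xi in x)  # f <= 0 once a_3 + lam >= |x|^2
    return (
        bisect(-a[2] + tiny, -a[2] + total),  # ellipsoid
        bisect(-a[1] + tiny, -a[2] - tiny),   # one-sheeted hyperboloid
        bisect(-a[0] + tiny, -a[1] - tiny),   # two-sheeted hyperboloid
    )


x, a = (1.0, 0.5, 0.8), (3.0, 2.0, 1.0)
roots = confocal_roots(x, a)
print(roots)
```

Each root lies in its own interval between consecutive poles, which is why exactly one quadric of each of the three types passes through a point in general position.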

Suppose that the ellipsoid under consideration is one of the ellipsoids 
in the family with semi-axes $a > b > c$. Each of the three ellipses $x_1 = 0$,
$x_2 = 0$, and $x_3 = 0$ is a closed geodesic. A geodesic starting from a point
of the largest ellipse (with semi-axes $a$ and $b$) in a direction close to the
direction of the ellipse (Figure 206), is alternately tangent to the two closed
lines of intersection of the ellipsoid with the one-sheeted hyperboloid of our
family $\lambda = \text{const}$.⁸⁴ This geodesic is either closed or is dense in the area


Figure 206 Geodesic on a triaxial ellipsoid 


Figure 207 Geodesics emanating from an umbilical point 


between the two lines of intersection. As the slope of the geodesic increases, 
the hyperboloids collapse down to the region “inside” the hyperbola which 
intersects our ellipsoid in its four “umbilical points.” In the limiting case 
we obtain geodesics passing through the umbilical points (Figure 207). 

It is interesting to note that all the geodesics starting at an umbilical 
point again converge at the opposite umbilical point, and all have the same 
length between the two umbilical points. Only one of these geodesics is closed, 
namely, the middle ellipse with semi-axes a and c. If we travel along any 
other geodesic passing through an umbilical point in any direction, we will 
approach this ellipse asymptotically. 

Finally, geodesics which intersect the largest ellipse even more “steeply” 
(Figure 208) are alternately tangent to the two lines of intersection of our 


84 These lines of intersection of the confocal surfaces are also lines of curvature of the ellipsoid.

Figure 208 Geodesics of an ellipsoid which are tangent to a two-sheeted hyperboloid


ellipsoid with a two-sheeted hyperboloid.⁸⁵ In general, they are dense in the
region between these lines. The small ellipse with semi-axes b and c is among 
these geodesics. 

“The main difficulty in integrating a given differential equation lies in 
introducing convenient variables, which there is no rule for finding. There- 
fore, we must travel the reverse path and after finding some notable substitu- 
tion, look for problems to which it can be successfully applied.” (Jacobi, 
“Lectures on dynamics”). 

A list of problems admitting separation of variables in spherical, elliptic, 
and parabolic coordinates is given in Section 48 of Landau and Lifshitz’s 
“Mechanics” (Oxford, Pergamon, 1960). 


48 Generating functions 


In this paragraph we construct the apparatus of generating functions for non-free canonical 
transformations. 


A The generating function $S_2(P, q)$


Let $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a canonical transformation with $g(p, q) = (P, Q)$. By
the definition of canonical transformation the differential form on $\mathbb{R}^{2n}$

$$p\,dq - P\,dQ = dS$$

is the total differential of some function $S(p, q)$. A canonical transformation is
free if we can take $q$, $Q$ as $2n$ independent coordinates. In this case the
function $S$ expressed in the coordinates $q$ and $Q$ is called a generating function
$S_1(q, Q)$. Knowing this function alone, we can find all $2n$ functions giving the
transformation from the relations

$$p = \frac{\partial S_1(q, Q)}{\partial q} \quad\text{and}\quad P = -\frac{\partial S_1(q, Q)}{\partial Q}. \tag{1}$$


It is far from the case that all canonical transformations are free. For 
example, in the case of the identity transformation q and Q = q are depen- 
dent. Therefore, the identity transformation cannot be given by a generating 


85 These are also lines of curvature.

function $S_1(q, Q)$. We can, however, obtain generating functions of another
form by means of the Legendre transformation. Suppose, for instance,
that we can take $P$, $q$ as independent local coordinates on $\mathbb{R}^{2n}$ (i.e., the
determinant $\det(\partial(P, q)/\partial(p, q)) = \det(\partial P/\partial p)$ is not zero). Then we have


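$$p\,dq - P\,dQ = dS \quad\text{and}\quad p\,dq + Q\,dP = d(PQ + S).$$

The quantity $PQ + S$, expressed in terms of $(P, q)$, is also called a generating
function

$$S_2(P, q) = PQ + S(p, q).$$

For this function, we find

$$p = \frac{\partial S_2(P, q)}{\partial q} \quad\text{and}\quad Q = \frac{\partial S_2(P, q)}{\partial P}. \tag{2}$$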


Conversely, if $S_2(P, q)$ is any function for which the determinant

$$\det\left.\left(\frac{\partial^2 S_2(P, q)}{\partial q\,\partial P}\right)\right|_{P_0,\,q_0}$$

is not zero, then in a neighborhood of the point

$$\left(p_0 = \left.\frac{\partial S_2(P, q)}{\partial q}\right|_{P_0,\,q_0},\ q_0\right)$$

we can solve the first group of equations (2) for $P$ and obtain a function
$P(p, q)$ (where $P(p_0, q_0) = P_0$). After this, the second group of equations (2)
determines $Q(p, q)$, and the map $(p, q) \to (P, Q)$ is canonical (prove this!).


PROBLEM. Find a generating function $S_2$ for the identity map $P = p$, $Q = q$.
ANSWER. $Pq$.


Remark. The generating function $S_2(P, q)$ is convenient also because there are no minus
signs in the formulas (2), and they are easy to remember if we remember that the generating
function of the identity transformation is $Pq$.
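
A slightly less trivial instance of formulas (2), given here as an illustrative sketch with one degree of freedom: take $S_2(P, q) = P f(q)$, where $f$ is an arbitrary smooth function introduced for this example with $f'(q) \neq 0$.

```latex
% Generating function of a change of the coordinate q alone:
S_2(P, q) = P\, f(q), \qquad
\frac{\partial^{2} S_2}{\partial q\,\partial P} = f'(q) \neq 0 .
% Formulas (2) give
p = \frac{\partial S_2}{\partial q} = P\, f'(q), \qquad
Q = \frac{\partial S_2}{\partial P} = f(q),
% so that P = p / f'(q), and the map is canonical:
dP \wedge dQ = dP \wedge f'(q)\, dq = d\bigl(P f'(q)\bigr) \wedge dq = dp \wedge dq .
```

For $f(q) = q$ this reduces to the generating function $Pq$ of the identity.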


B $2^n$ generating functions


Unfortunately, the variables $P$, $q$ cannot always be chosen for local
coordinates either; however, we can always choose some set of $n$ new
coordinates

$$P_i = (P_{i_1}, \ldots, P_{i_k}), \qquad Q_j = (Q_{j_1}, \ldots, Q_{j_{n-k}})$$

so that together with the old $q$ we obtain $2n$ independent coordinates.
Here $(i_1, \ldots, i_k)(j_1, \ldots, j_{n-k})$ is any partition of the set $(1, \ldots, n)$ into
two non-intersecting parts; so there are in all $2^n$ cases.


Theorem. Let $g: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ be a canonical transformation given by the
functions $P(p, q)$ and $Q(p, q)$. In a neighborhood of every point $(p_0, q_0)$ at
least one of the $2^n$ sets of functions $(P_i, Q_j, q)$ can be taken as independent
coordinates on $\mathbb{R}^{2n}$:

$$\det\frac{\partial(P_i, Q_j, q)}{\partial(p_i, p_j, q)} = \det\frac{\partial(P_i, Q_j)}{\partial(p_i, p_j)} \neq 0.$$

In a neighborhood of such a point, the canonical transformation $g$ can be
reconstructed from the function

$$S_3(P_i, Q_j, q) = (P_i, Q_i) + \int p\,dq - P\,dQ$$

by the relations

$$p = \frac{\partial S_3}{\partial q}, \qquad Q_i = \frac{\partial S_3}{\partial P_i}, \quad\text{and}\quad P_j = -\frac{\partial S_3}{\partial Q_j}. \tag{3}$$

Conversely, if $S_3(P_i, Q_j, q)$ is any function for which the determinant
$\det(\partial^2 S_3/\partial P\,\partial q)|_{P_0, q_0}$ (where $P = (P_i, Q_j)$) is not zero, then the relations (3) give a
canonical transformation in a neighborhood of the point $p_0$, $q_0$.


Proof. The proof of this theorem is almost the same as the one carried out
above in the particular case $k = n$. We need only verify that the determinant
$\det(\partial(P_i, Q_j)/\partial(p_i, p_j))$ is not zero for one of the $2^n$ sets $(P_i, Q_j, q)$.


We consider the differential of our transformation $g$ at the point $(p_0, q_0)$. By identifying the
tangent space to $\mathbb{R}^{2n}$ with $\mathbb{R}^{2n}$, we can consider $dg$ as a symplectic transformation $S: \mathbb{R}^{2n} \to \mathbb{R}^{2n}$.

Consider the coordinate $p$-plane $P$ in $\mathbb{R}^{2n}$ (Figure 209). This is a null $n$-plane, and its image $SP$
is also a null plane. We project the plane $SP$ onto the coordinate plane $\sigma = \{(p_i, q_j)\}$ parallel to
the remaining coordinate axes, i.e., in the direction of the $n$-dimensional null coordinate plane
$\bar\sigma = \{(p_j, q_i)\}$. We denote the projection operator by $TS: P \to \sigma$.

The condition $\det(\partial(P_i, Q_j)/\partial(p_i, p_j)) \neq 0$ means that $T: SP \to \sigma$ is nonsingular. The operator
$S$ is nonsingular. Therefore, $TS$ is nonsingular if and only if $T: SP \to \sigma$ is nonsingular. In other
words, the null plane $SP$ must be transverse to the null coordinate plane $\bar\sigma$. But we showed in
Section 41 that at least one of the $2^n$ null coordinate planes is transverse to $SP$. This means that
one of our $2^n$ determinants is nonzero, as was to be shown. □


Figure 209 Checking non-degeneracy


PROBLEM. Show that this system of $2^n$ types of generating functions is minimal: given any one of
the $2^n$ determinants, there exists a canonical transformation for which only this determinant is
nonzero.⁸⁶


C Infinitesimal canonical transformations 


We now consider a canonical transformation which is close to the identity.
Its generating function can be taken close to the generating function $Pq$
of the identity. We look at a family of canonical transformations $g_\varepsilon$ depending
differentiably on the parameter $\varepsilon$, such that the generating functions have
the form

$$Pq + \varepsilon S(P, q; \varepsilon), \qquad p = P + \varepsilon\frac{\partial S}{\partial q}, \qquad Q = q + \varepsilon\frac{\partial S}{\partial P}. \tag{4}$$

An infinitesimal canonical transformation is an equivalence class of families
$g_\varepsilon$, two families $g_\varepsilon$ and $h_\varepsilon$ being equivalent if their difference is small of higher
than first order, $|g_\varepsilon - h_\varepsilon| = O(\varepsilon^2)$, $\varepsilon \to 0$.


Theorem. An infinitesimal canonical transformation satisfies Hamilton’s
differential equations

$$\left.\frac{dP}{d\varepsilon}\right|_{\varepsilon=0} = -\frac{\partial H}{\partial q}, \qquad \left.\frac{dQ}{d\varepsilon}\right|_{\varepsilon=0} = \frac{\partial H}{\partial p}$$

with hamiltonian function $H(p, q) = S(p, q, 0)$.

Proof. The result follows from formula (4): $P \to p$ as $\varepsilon \to 0$. □

Corollary. A one-parameter group of transformations of phase space $\mathbb{R}^{2n}$
satisfies Hamilton’s canonical equations if and only if the transformations
are canonical.
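
The theorem can be tried out on a simple family. The sketch below is illustrative only: it assumes one degree of freedom and the particular choice $S(P, q) = (P^2 + q^2)/2$, so that formulas (4) read $p = P + \varepsilon q$, $Q = q + \varepsilon P$. It checks by finite differences that the $\varepsilon$-derivative at $\varepsilon = 0$ is the hamiltonian vector field of $H = (p^2 + q^2)/2$, and that each map of the family preserves area (which, for one degree of freedom, is what canonicity means). All function names are hypothetical.

```python
def near_identity_map(p, q, eps):
    """Map (p, q) -> (P, Q) generated by P*q + eps * (P**2 + q**2) / 2."""
    P = p - eps * q  # solved from p = P + eps * dS/dq = P + eps * q
    Q = q + eps * P  # from Q = q + eps * dS/dP = q + eps * P
    return P, Q


def hamiltonian_field(p, q):
    """(-dH/dq, dH/dp) for H = (p**2 + q**2) / 2."""
    return -q, p


def eps_derivative_at_zero(p, q, h=1e-6):
    """Central-difference estimate of (dP/d eps, dQ/d eps) at eps = 0."""
    P_plus, Q_plus = near_identity_map(p, q, h)
    P_minus, Q_minus = near_identity_map(p, q, -h)
    return (P_plus - P_minus) / (2 * h), (Q_plus - Q_minus) / (2 * h)


def jacobian_determinant(p, q, eps, h=1e-6):
    """Central-difference estimate of det d(P, Q)/d(p, q)."""
    P_p1, Q_p1 = near_identity_map(p + h, q, eps)
    P_p0, Q_p0 = near_identity_map(p - h, q, eps)
    P_q1, Q_q1 = near_identity_map(p, q + h, eps)
    P_q0, Q_q0 = near_identity_map(p, q - h, eps)
    dP_dp, dQ_dp = (P_p1 - P_p0) / (2 * h), (Q_p1 - Q_p0) / (2 * h)
    dP_dq, dQ_dq = (P_q1 - P_q0) / (2 * h), (Q_q1 - Q_q0) / (2 * h)
    return dP_dp * dQ_dq - dP_dq * dQ_dp


p0, q0 = 0.8, -0.3
velocity = eps_derivative_at_zero(p0, q0)
expected = hamiltonian_field(p0, q0)
det = jacobian_determinant(p0, q0, eps=0.25)
print(velocity, expected, det)
```

Note that the family itself is not the phase flow of $H$ (its $\varepsilon^2$ terms differ from those of a rotation); it only belongs to the same equivalence class, which is all the theorem asserts.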


Figure 210 Geometric meaning of Hamilton’s function


86 The number of kinds of generating functions in different textbooks ranges from 4 to $4^n$.

The hamiltonian function $H$ is called the “generating function of the
infinitesimal canonical transformation.” We notice that unlike the generating
function $S$, the function $H$ is a function of points of phase space, invariantly
associated to the transformation.

The function $H$ has a simple geometric meaning. Let $x$ and $y$ be two points
in $\mathbb{R}^{2n}$ (Figure 210), $\gamma$ a curve connecting them, and $\partial\gamma = y - x$. Consider
the images of the curve $\gamma$ under the transformations $g_\tau$, $0 \le \tau \le \varepsilon$; they
form a band $\sigma(\varepsilon)$. Now consider the integral of the form $\omega^2 = \sum dp_i \wedge dq_i$
over the 2-chain $\sigma$, using the fact that $\partial\sigma = g_\varepsilon\gamma - \gamma + g_\tau x - g_\tau y$.


PROBLEM. Show that

$$\lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \iint_{\sigma(\varepsilon)} \omega^2 = H(x) - H(y)$$

exists and does not depend on the representative of the class $g_\varepsilon$.
From this result we once more obtain the well-known 


Corollary. Under canonical transformations the canonical equations retain 
their form, with the same hamiltonian function. 


Proof. We computed the variation of the hamiltonian function using only
an infinitesimal canonical transformation and the symplectic structure of
$\mathbb{R}^{2n}$, namely the form $\omega^2$. □


Perturbation theory consists of a very useful collection of methods for finding
approximate solutions of “perturbed” problems which are close to com- 
pletely solvable “unperturbed” problems. These methods can be easily 
justified if we are investigating motion over a small interval of time. Relatively 
little is known about how far we can trust the conclusions of perturbation 
theory in investigating motion over large or infinite intervals of time. 

We will see that the motion in many “unperturbed” integrable problems
turns out to be conditionally periodic. In the study of unperturbed problems, 
and even more so in the study of the perturbed problems, special symplectic 
coordinates, called “action-angle” variables, are useful. In conclusion, we 
will prove a theorem justifying perturbation theory for single-frequency 
systems and will prove the adiabatic invariance of action variables in such 
systems. 


49 Integrable systems 


In order to integrate a system of 2n ordinary differential equations, we must know 2n first 
integrals. It turns out that if we are given a canonical system of differential equations, it is often 
sufficient to know only n first integrals—each of them allows us to reduce the order of the system 
not just by one, but by two. 


A Liouville’s theorem on integrable systems 


Recall that a function F is a first integral of a system with hamiltonian 
function H if and only if the Poisson bracket 


(H, F) =0 


is identically equal to zero.

Definition. Two functions $F_1$ and $F_2$ on a symplectic manifold are in involution
if their Poisson bracket is equal to zero.


Liouville proved that if, in a system with n degrees of freedom (i.e., with 
a 2n-dimensional phase space), n independent first integrals in involution 
are known, then the system is integrable by quadratures. 

Here is the exact formulation of this theorem: Suppose that we are given n 
functions in involution on a symplectic 2n-dimensional manifold 


$$F_1, \ldots, F_n; \qquad (F_i, F_j) \equiv 0, \quad i, j = 1, 2, \ldots, n.$$

Consider a level set of the functions $F_i$

$$M_f = \{x : F_i(x) = f_i,\ i = 1, \ldots, n\}.$$

Assume that the $n$ functions $F_i$ are independent on $M_f$ (i.e., the $n$ 1-forms
$dF_i$ are linearly independent at each point of $M_f$). Then


1. $M_f$ is a smooth manifold, invariant under the phase flow with hamiltonian
function $H = F_1$.

2. If the manifold $M_f$ is compact and connected, then it is diffeomorphic
to the $n$-dimensional torus

$$T^n = \{(\varphi_1, \ldots, \varphi_n) \bmod 2\pi\}.$$

3. The phase flow with hamiltonian function $H$ determines a conditionally
periodic motion on $M_f$, i.e., in angular coordinates $\varphi = (\varphi_1, \ldots, \varphi_n)$
we have

$$\frac{d\varphi}{dt} = \omega, \qquad \omega = \omega(f).$$


4. The canonical equations with hamiltonian function H can be integrated 
by quadratures. 


Before proving this theorem, we note a few of its corollaries. 


Corollary 1. If, in a canonical system with two degrees of freedom, a first 
integral F is known which does not depend on the hamiltonian H, then the 
system is integrable by quadratures; a compact connected two-dimensional 
submanifold of the phase space H = h, F = f is an invariant torus, and 
motion on it is conditionally periodic. 


Proof. $F$ and $H$ are in involution since $F$ is a first integral of a system with
hamiltonian function $H$. □
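
Involution is easy to test numerically for a concrete pair of functions. The sketch below is an illustration under stated assumptions: planar motion in the central field $U = -1/r$, with $M = x p_y - y p_x$ the angular momentum, and the bracket computed by central differences in the convention $(F, G) = \sum (\partial F/\partial q_i\,\partial G/\partial p_i - \partial F/\partial p_i\,\partial G/\partial q_i)$. The sign convention is an assumption of the sketch and does not affect whether a bracket vanishes. All names are hypothetical.

```python
import math


def partial(F, z, i, h):
    """Central-difference partial derivative of F with respect to z[i]."""
    zp, zm = list(z), list(z)
    zp[i] += h
    zm[i] -= h
    return (F(*zp) - F(*zm)) / (2 * h)


def poisson_bracket(F, G, z, h=1e-5):
    """Bracket (F, G) at the phase point z = (p_1, ..., p_n, q_1, ..., q_n)."""
    n = len(z) // 2
    total = 0.0
    for i in range(n):
        total += (partial(F, z, n + i, h) * partial(G, z, i, h)
                  - partial(F, z, i, h) * partial(G, z, n + i, h))
    return total


def H(px, py, x, y):
    """Unit mass in the attracting field U = -1/r."""
    return 0.5 * (px * px + py * py) - 1.0 / math.hypot(x, y)


def M(px, py, x, y):
    """Angular momentum about the attracting center."""
    return x * py - y * px


def X(px, py, x, y):
    """The coordinate x itself, which is not a first integral."""
    return x


z = (0.3, -0.5, 1.1, 0.7)
bracket_hm = poisson_bracket(H, M, z)  # vanishes: M is a first integral
bracket_hx = poisson_bracket(H, X, z)  # equals -p_x in this convention
print(bracket_hm, bracket_hx)
```

With two degrees of freedom, the single additional integral $M$ is therefore enough for Corollary 1 to apply.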


As an example with three degrees of freedom, we consider a heavy
symmetric Lagrange top fixed at a point on its axis. Three first integrals are
immediately obvious: $H$, $M_z$, and $M_3$. It is easy to verify that the integrals
$M_z$ and $M_3$ are in involution. Furthermore, the manifold $H = h$ in the phase
space is compact. Therefore, we can immediately say, without any
calculations, that for the majority of initial conditions⁸⁷ the motion of the top is
conditionally periodic: the phase trajectories fill up the three-dimensional
torus $H = c_1$, $M_z = c_2$, $M_3 = c_3$. The corresponding three frequencies are
called frequencies of fundamental rotation, precession, and nutation. 

Other examples arise from the following observation: if a canonical 
system can be integrated by the method of Hamilton-Jacobi, then it has n 
first integrals in involution. The method consists of a canonical transformation 
$(p, q) \to (P, Q)$ such that the $Q_i$ are first integrals. But the functions $Q_i$
and $Q_j$ are clearly in involution.

In particular, the observation above applies to the problem of attraction 
by two fixed centers. Other examples are easily found. In fact, the theorem 
of Liouville formulated above covers all the problems of dynamics which 
have been integrated to the present day. 


B Beginning of the proof of Liouville’s theorem 


We turn now to the proof of the theorem. Consider the level set of the 
integrals: 


$$M_f = \{x : F_i = f_i,\ i = 1, \ldots, n\}.$$


By hypothesis, the $n$ 1-forms $dF_i$ are linearly independent at each point of
$M_f$; therefore, by the implicit function theorem, $M_f$ is an $n$-dimensional
submanifold of the $2n$-dimensional phase space.


Lemma 1. On the $n$-dimensional manifold $M_f$ there exist $n$ tangent vector
fields which commute with one another and which are linearly independent
at every point.


Proof. The symplectic structure of phase space defines an operator $I$ taking
1-forms to vector fields. This operator $I$ carries the 1-form $dF_i$ to the field
$I\,dF_i$ of phase velocities of the system with hamiltonian function $F_i$. We
will show that the $n$ fields $I\,dF_i$ are tangent to $M_f$, commute, and are
independent.

The independence of the $I\,dF_i$ at every point of $M_f$ follows from the
independence of the $dF_i$ and the nonsingularity of the isomorphism $I$. The
fields $I\,dF_i$ commute with one another, since the Poisson brackets of their
hamiltonian functions $(F_i, F_j)$ are identically 0. For the same reason, the
derivative of the function $F_i$ in the direction of the field $I\,dF_j$ is equal to zero
for any $i, j = 1, \ldots, n$. Thus the fields $I\,dF_i$ are tangent to $M_f$, and Lemma 1
is proved. □


87 The singular level sets, where the integrals are not functionally independent, constitute the
exception.

We notice that we have proved even more than Lemma 1:


1′. The manifold $M_f$ is invariant with respect to each of the $n$ commuting
phase flows $g_i^t$ with hamiltonian functions $F_i$: $g_i^t g_j^s = g_j^s g_i^t$.
1″. The manifold $M_f$ is null (i.e., the 2-form $\omega^2$ is zero on $TM_f|_x$).


This is true since the $n$ vectors $I\,dF_i|_x$ are skew-orthogonal to one another
($(F_i, F_j) \equiv 0$) and form a basis of the tangent plane to the manifold $M_f$ at
the point $x$.


C Manifolds on which the action of the group $\mathbb{R}^n$ is transitive


We will now use the following topological proposition (the proof is completed 
in Section D). 


Lemma 2. Let $M^n$ be a compact connected differentiable $n$-dimensional
manifold, on which we are given $n$ pairwise commutative and linearly independent
at each point vector fields. Then $M^n$ is diffeomorphic to an $n$-dimensional
torus.


Proof. We denote by $g_i^t$, $i = 1, \ldots, n$, the one-parameter groups of
diffeomorphisms of $M$ corresponding to the $n$ given vector fields. Since the fields
commute, the groups $g_i^t$ and $g_j^s$ commute. Therefore, we can define an action $g$
of the commutative group $\mathbb{R}^n = \{t\}$ on the manifold $M$ by setting

$$g^t: M \to M, \qquad g^t = g_1^{t_1} \cdots g_n^{t_n} \qquad (t = (t_1, \ldots, t_n) \in \mathbb{R}^n).$$

Clearly, $g^{t+s} = g^t g^s$, $t, s \in \mathbb{R}^n$. Now fix a point $x_0 \in M$. Then we have a map

$$g: \mathbb{R}^n \to M, \qquad g(t) = g^t x_0.$$


(The point $x_0$ moves along the trajectory of the first flow for time $t_1$, along
the second flow for time $t_2$, etc.)


PROBLEM 1. Show that the map $g$ (Figure 211) of a sufficiently small neighborhood $V$ of the
point $0 \in \mathbb{R}^n$ gives a chart in a neighborhood of $x_0$: every point $x_0 \in M$ has a neighborhood
$U$ ($x_0 \in U \subset M$) such that $g$ maps $V$ diffeomorphically onto $U$.

Hint. Apply the implicit function theorem and use the linear independence of the fields at $x_0$.


PROBLEM 2. Show that $g: \mathbb{R}^n \to M$ is onto.


Figure 211 Problem 1

Figure 212 Problem 2


Hint. Connect a point $x \in M$ with $x_0$ by a curve (Figure 212), cover the curve by a finite
number of the neighborhoods $U$ of the preceding problem and define $t$ as the sum of shifts $t_i$
corresponding to pieces of the curve.


We note that the map $g: \mathbb{R}^n \to M^n$ cannot be one-to-one since $M^n$ is
compact and $\mathbb{R}^n$ is not. We will examine the set of pre-images of $x_0 \in M^n$.


Definition. The stationary group of the point $x_0$ is the set $\Gamma$ of points $t \in \mathbb{R}^n$
for which $g^t x_0 = x_0$.


PROBLEM 3. Show that $\Gamma$ is a subgroup of the group $\mathbb{R}^n$, independent of the point $x_0$.

Solution. If $g^s x_0 = x_0$ and $g^t x_0 = x_0$, then $g^{s+t} x_0 = g^s g^t x_0 = g^s x_0 = x_0$ and $g^{-t} x_0 =
g^{-t} g^t x_0 = x_0$. Therefore, $\Gamma$ is a subgroup of $\mathbb{R}^n$. If $x = g^r x_0$ and $t \in \Gamma$, then $g^t x = g^{t+r} x_0 =
g^r g^t x_0 = g^r x_0 = x$.


In this way the stationary group $\Gamma$ is a well-defined subgroup of $\mathbb{R}^n$
independent of the point $x_0$. In particular, the point $t = 0$ clearly belongs
to $\Gamma$.


PROBLEM 4. Show that, in a sufficiently small neighborhood $V$ of the point $0 \in \mathbb{R}^n$, there is no
point of the stationary group other than $t = 0$.
Hint. The map $g: V \to U$ is a diffeomorphism.


PROBLEM 5. Show that, in the neighborhood $t + V$ of any point $t \in \Gamma \subset \mathbb{R}^n$, there is no point of
the stationary group $\Gamma$ other than $t$. (Figure 213)


Thus the points of the stationary group $\Gamma$ lie in $\mathbb{R}^n$ discretely. Such
subgroups are called discrete subgroups.


Figure 213 Problem 5

Figure 214 A discrete subgroup of the plane


EXAMPLE. Let $e_1, \ldots, e_k$ be $k$ linearly independent vectors in $\mathbb{R}^n$, $0 \le k \le n$.
The set of all their integral linear combinations (Figure 214)

$$m_1 e_1 + \cdots + m_k e_k, \qquad m_i \in \mathbb{Z} = (\ldots, -2, -1, 0, 1, \ldots)$$

forms a discrete subgroup of $\mathbb{R}^n$. For example, the set of all integral points
in the plane is a discrete subgroup of the plane.


D Discrete subgroups in $\mathbb{R}^n$


We will now use the algebraic fact that the example above includes all discrete
subgroups of $\mathbb{R}^n$. More precisely, we will prove


Lemma 3. Let $\Gamma$ be a discrete subgroup of $\mathbb{R}^n$. Then there exist $k$ ($0 \le k \le n$)
linearly independent vectors $e_1, \ldots, e_k \in \Gamma$ such that $\Gamma$ is exactly the set of
all their integral linear combinations.


Proof. We will consider $\mathbb{R}^n$ with some euclidean structure. We always
have $0 \in \Gamma$. If $\Gamma = \{0\}$ the lemma is proved. If not, there is a point $e_0 \in \Gamma$,
$e_0 \neq 0$ (Figure 215). Consider the line $\mathbb{R}e_0$. We will show that among the
elements of $\Gamma$ on this line, there is a point $e_1$ which is closest to 0. In fact,
in the disk of radius $|e_0|$ with center 0, there are only a finite number of points
of $\Gamma$ (as we saw above, every point $x$ of $\Gamma$ has a neighborhood $V$ of standard
size which does not contain any other point of $\Gamma$). Among the finite number
of points of $\Gamma$ inside this disc and lying on the line $\mathbb{R}e_0$, the point closest to 0
will be the closest point to 0 on the whole line. The integral multiples of this
point $e_1$ ($me_1$, $m \in \mathbb{Z}$) constitute the intersection of the line $\mathbb{R}e_0$ with $\Gamma$.


Figure 215 Proof of the lemma on discrete subgroups

In fact, the points $me_1$ divide the line into pieces of length $|e_1|$. If there were
a point $e \in \Gamma$ inside one of these pieces $(me_1, (m + 1)e_1)$, then the point
$e - me_1 \in \Gamma$ would be closer to 0 than $e_1$.
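
The one-dimensional step of the argument can be seen in the simplest setting. The sketch below is an illustration, not part of the proof: it assumes a subgroup of the line generated by finitely many integers (which is automatically discrete), and it searches only a bounded window of coefficients, which is an assumption of the sketch. The element closest to 0 that it finds coincides with the greatest common divisor, and every generator is an integral multiple of it. The function name is hypothetical.

```python
from itertools import product
from math import gcd


def closest_nonzero_element(generators, bound=30):
    """Smallest positive value of sum(m_i * g_i) over integers |m_i| <= bound."""
    best = None
    for coeffs in product(range(-bound, bound + 1), repeat=len(generators)):
        value = sum(m * g for m, g in zip(coeffs, generators))
        if value > 0 and (best is None or value < best):
            best = value
    return best


e1 = closest_nonzero_element((12, 18))
print(e1, gcd(12, 18))
```

Here $e_1$ plays the role of the point of $\Gamma$ closest to 0: if some element of the subgroup were not a multiple of $e_1$, subtracting a suitable multiple would produce a positive element smaller than $e_1$, exactly as in the argument above.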

If there are no points of $\Gamma$ off the line $\mathbb{R}e_1$, the lemma is proved. Suppose
there is a point $e \in \Gamma$, $e \notin \mathbb{R}e_1$. We will show that there is a point $e_2 \in \Gamma$ closest
to the line $\mathbb{R}e_1$ (but not lying on the line). We project $e$ orthogonally onto $\mathbb{R}e_1$.
The projection lies in exactly one interval $\Delta = \{\lambda e_1\}$, $m \le \lambda < m + 1$.
Consider the right circular cylinder $C$ with axis $\Delta$ and radius equal to the
distance from $\Delta$ to $e$. In this cylinder lie a finite (nonempty) number of points
of the group $\Gamma$. Let $e_2$ be the closest one to the axis $\mathbb{R}e_1$ not lying on the axis.


PROBLEM 6. Show that the distance from this axis to any point $e$ of $\Gamma$ not lying on $\mathbb{R}e_1$ is greater
than or equal to the distance of $e_2$ from $\mathbb{R}e_1$.
Hint. By a shift of $me_1$ we can move the projection of $e$ onto the axis interval $\Delta$.


The integral linear combinations of $e_1$ and $e_2$ form a lattice in the plane
$\mathbb{R}e_1 + \mathbb{R}e_2$.


PROBLEM 7. Show that there are no points of $\Gamma$ on the plane $\mathbb{R}e_1 + \mathbb{R}e_2$ other than integral
linear combinations of $e_1$ and $e_2$.

Hint. Partition the plane into parallelograms (Figure 216) $\Delta = \{\lambda_1 e_1 + \lambda_2 e_2\}$,
$m_i \le \lambda_i < m_i + 1$. If there were an $e \in \Delta$ with $e \neq m_1 e_1 + m_2 e_2$, then the point $e - m_1 e_1 - m_2 e_2$
would be closer to $\mathbb{R}e_1$ than $e_2$.


Figure 216 Problem 7


If there are no points of $\Gamma$ outside the plane $\mathbb{R}e_1 + \mathbb{R}e_2$, the lemma is
proved. Suppose that there is a point $e \in \Gamma$ outside this plane. Then there exists
a point $e_3 \in \Gamma$ closest to $\mathbb{R}e_1 + \mathbb{R}e_2$; the points $m_1 e_1 + m_2 e_2 + m_3 e_3$
exhaust $\Gamma$ in the three-dimensional space $\mathbb{R}e_1 + \mathbb{R}e_2 + \mathbb{R}e_3$. If $\Gamma$ is not
exhausted by these, we take the closest point to this three-dimensional
space, etc.


PROBLEM 8. Show that this closest point always exists. 
Hint. Take the closest of the finite number of points in a “cylinder” C. 


Note that the vectors $e_1, e_2, e_3, \ldots$ are linearly independent. Since they all
lie in $\mathbb{R}^n$, there are $k \le n$ of them.

PROBLEM 9. Show that $\Gamma$ is exhausted by the integral linear combinations of $e_1, \ldots, e_k$.

Hint. Partition the plane $\mathbb{R}e_1 + \cdots + \mathbb{R}e_k$ into parallelepipeds $\Delta$ and show that there cannot
be a point of $\Gamma$ in any $\Delta$. If there is an $e \in \Gamma$ outside the plane $\mathbb{R}e_1 + \cdots + \mathbb{R}e_k$, the construction
is not finished.


Thus Lemma 3 is proved. □


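It is now easy to prove Lemma 2: $M_f$ is diffeomorphic to a torus $T^n$.
Consider the direct product of $k$ circles and $n - k$ straight lines:

$$T^k \times \mathbb{R}^{n-k} = \{(\varphi_1, \ldots, \varphi_k;\ y_1, \ldots, y_{n-k})\}, \qquad \varphi \bmod 2\pi,$$

together with the natural map $p: \mathbb{R}^n \to T^k \times \mathbb{R}^{n-k}$,

$$p(\varphi, y) = (\varphi \bmod 2\pi,\ y).$$

The points $f_1, \ldots, f_k \in \mathbb{R}^n$ ($f_i$ has coordinates $\varphi_i = 2\pi$, $\varphi_j = 0$ for $j \neq i$, $y = 0$) are
mapped to 0 under this map.

Let $e_1, \ldots, e_k \in \Gamma \subset \mathbb{R}^n$ be the generators of the group $\Gamma$ (cf. Lemma 3).
We map the vector space $\mathbb{R}^n = \{(\varphi, y)\}$ onto the space $\mathbb{R}^n = \{t\}$ so that the
vectors $f_i$ go to $e_i$. Let $A: \mathbb{R}^n \to \mathbb{R}^n$ be such an isomorphism.

We now note that $\mathbb{R}^n = \{(\varphi, y)\}$ gives charts for $T^k \times \mathbb{R}^{n-k}$, and $\mathbb{R}^n = \{t\}$
gives charts for our manifold $M_f$.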


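PROBLEM 10. Show that the map of charts $A: \mathbb{R}^n \to \mathbb{R}^n$ gives a diffeomorphism
$\tilde{A}: T^k \times \mathbb{R}^{n-k} \to M_f$,

$$\begin{array}{ccc}
\mathbb{R}^n = \{(\varphi, y)\} & \xrightarrow{\ A\ } & \mathbb{R}^n = \{t\} \\
\downarrow{\scriptstyle p} & & \downarrow{\scriptstyle g} \\
T^k \times \mathbb{R}^{n-k} & \xrightarrow{\ \tilde{A}\ } & M_f
\end{array}$$


But, since the manifold $M_f$ is compact by hypothesis, $k = n$ and $M_f$ is an
$n$-dimensional torus. Lemma 2 is proved. □


In view of Lemma 1, the first two statements of the theorem are proved.
At the same time, we have constructed angular coordinates $\varphi_1, \ldots, \varphi_n \bmod 2\pi$
on $M_f$.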


PROBLEM 11. Show that under the action of the phase flow with hamiltonian $H$ the angular
coordinates $\varphi$ vary uniformly with time

$$\frac{d\varphi_i}{dt} = \omega_i, \qquad \omega_i = \omega_i(f), \qquad \varphi(t) = \varphi(0) + \omega t.$$

In other words, motion on the invariant torus $M_f$ is conditionally periodic.
Hint. $\varphi = A^{-1}t$.


Of all the assertions of the theorem, only the last remains to be proved: 
that the system can be integrated by quadratures. 
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50 Action-angle variables 


We show here that, under the hypotheses of Liouville’s theorem, we can find symplectic
coordinates $(I, \varphi)$ such that the first integrals $F$ depend only on $I$, and $\varphi$ are angular coordinates
on the torus $M_f$.


A Description of action-angle variables 


In Section 49 we studied one particular compact connected level manifold
of the integrals: $M_f = \{x : F(x) = f\}$; it turned out that $M_f$ was an
n-dimensional torus, invariant with respect to the phase flow. We chose angular
coordinates $\varphi_i$ on $M_f$ so that the phase flow with hamiltonian function $H = F_1$
takes an especially simple form:

$$\frac{d\varphi}{dt} = \omega(f), \qquad \varphi(t) = \varphi(0) + \omega t.$$

We will now look at a neighborhood of the n-dimensional manifold $M_f$
in 2n-dimensional phase space.


PROBLEM. Show that the manifold $M_f$ has a neighborhood diffeomorphic to the direct product
of the n-dimensional torus $T^n$ and the disc $D^n$ in n-dimensional euclidean space.

Hint. Take the functions $F_i$ and the angles $\varphi_i$ constructed above as coordinates. In view of
the linear independence of the $dF_i$, the functions $F_i$ and $\varphi_i$ ($i = 1, \ldots, n$) give a diffeomorphism
of a neighborhood of $M_f$ onto the direct product $T^n \times D^n$.


In the coordinates $(F, \varphi)$ the phase flow with hamiltonian function $H = F_1$
can be written in the form of the simple system of 2n ordinary differential
equations

$$\frac{dF}{dt} = 0, \qquad \frac{d\varphi}{dt} = \omega(F), \tag{1}$$

which is easily integrated: $F(t) = F(0)$, $\varphi(t) = \varphi(0) + \omega(F(0))\,t$.

Thus, in order to integrate explicitly the original canonical system of
differential equations, it is sufficient to find the variables $\varphi$ in explicit form.
It turns out that this can be done using only quadratures. A construction of
the variables $\varphi$ is given below.

We note that the variables $(F, \varphi)$ are not, in general, symplectic
coordinates. It turns out that there are functions of $F$, which we will denote
by $I = I(F)$, $I = (I_1, \ldots, I_n)$, such that the variables $(I, \varphi)$ are symplectic
coordinates: the original symplectic structure $\omega^2$ is expressed in them by
the usual formula

$$\omega^2 = \sum dI_i \wedge d\varphi_i.$$

The variables $I$ are called action variables;⁸⁸ together with the angle variables
$\varphi$ they form the action-angle system of canonical coordinates in a
neighborhood of $M_f$.

The quantities $I_i$ are first integrals of the system with hamiltonian function
$H = F_1$, since they are functions of the first integrals $F_j$. In turn, the variables
$F_i$ can be expressed in terms of $I$ and, in particular, $H = F_1 = H(I)$. In
action-angle variables the differential equations of our flow (1) have the form

$$\frac{dI}{dt} = 0, \qquad \frac{d\varphi}{dt} = \omega(I). \tag{2}$$


PROBLEM. Can the functions $\omega(I)$ in (2) be arbitrary?

Solution. In the variables $(I, \varphi)$, the equations of the flow (2) have the canonical form with
hamiltonian function $H(I)$. Therefore, $\omega(I) = \partial H/\partial I$; thus if the number of degrees of freedom
is $n \ge 2$, the functions $\omega(I)$ are not arbitrary, but satisfy the symmetry condition
$\partial\omega_i/\partial I_j = \partial\omega_j/\partial I_i$.


Action-angle variables are especially important for perturbation theory; 
in Section 52 we will demonstrate their application to the theory of adiabatic 
invariants. 


B Construction of action-angle variables in the 
case of one degree of freedom 


A system with one degree of freedom in the phase plane (p, q) is given by the 
hamiltonian function H(p, q). 


EXAMPLE 1. The harmonic oscillator $H = \tfrac12 p^2 + \tfrac12 q^2$; or, more generally,
$H = \tfrac12 a^2p^2 + \tfrac12 b^2q^2$.


EXAMPLE 2. The mathematical pendulum $H = \tfrac12 p^2 - \cos q$. In both cases
we have a compact closed curve $M_h$ ($H = h$), and the conditions of the
theorem of Section 49 for $n = 1$ are satisfied.


In order to construct the action-angle variables, we will look for a
canonical transformation $(p, q) \to (I, \varphi)$ satisfying the two conditions:

$$1.\ I = I(h), \qquad 2.\ \oint_{M_h} d\varphi = 2\pi. \tag{3}$$


PROBLEM. Find the action-angle variables in the case of the simple harmonic oscillator
$H = \tfrac12 p^2 + \tfrac12 q^2$.
Solution. If $r, \varphi$ are polar coordinates, then $dp \wedge dq = r\,dr \wedge d\varphi = d(r^2/2) \wedge d\varphi$.
Therefore, $I = H = (p^2 + q^2)/2$.
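
The solution can be checked by a direct computation. The sketch below is an added illustration; it assumes that the polar angle is measured so that $q = r\sin\varphi$, $p = r\cos\varphi$ (the orientation for which $dp \wedge dq = r\,dr \wedge d\varphi$) and substitutes $r = \sqrt{2I}$:

```latex
\begin{aligned}
q &= \sqrt{2I}\,\sin\varphi, \qquad p = \sqrt{2I}\,\cos\varphi, \\
dp \wedge dq
  &= \Bigl(\frac{\cos\varphi}{\sqrt{2I}}\,dI - \sqrt{2I}\,\sin\varphi\,d\varphi\Bigr)
     \wedge
     \Bigl(\frac{\sin\varphi}{\sqrt{2I}}\,dI + \sqrt{2I}\,\cos\varphi\,d\varphi\Bigr) \\
  &= (\cos^2\varphi + \sin^2\varphi)\,dI \wedge d\varphi
   = dI \wedge d\varphi .
\end{aligned}
```

With $H = I$ the equations of motion become $\dot I = 0$, $\dot\varphi = \partial H/\partial I = 1$, so the angle is the phase of the oscillation.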


⁸⁸ It is not hard to see that $I$ has the dimensions of action.

In order to construct the canonical transformation $p, q \to I, \varphi$ in the
general case, we will look for its generating function $S(I, q)$:

$$p = \frac{\partial S(I, q)}{\partial q}, \qquad \varphi = \frac{\partial S(I, q)}{\partial I}, \qquad H\!\left(\frac{\partial S(I, q)}{\partial q},\ q\right) = h(I). \tag{4}$$

We first assume that the function $h(I)$ is known and invertible, so that every
curve $M_h$ is determined by the value of $I$ ($M_h = M_{h(I)}$). Then for a fixed
value of $I$ we have from (4)

$$dS\big|_{I = \text{const}} = p\,dq.$$


This relation determines a well-defined differential 1-form $dS$ on the curve $M_{h(I)}$.
Integrating this 1-form on the curve $M_{h(I)}$ we obtain (in a neighborhood
of a point $q_0$) a function

$$S(I, q) = \int_{q_0}^{q} p\,dq.$$

This function will be the generating function of the transformation (4) in
a neighborhood of the point $(I, q_0)$. The first of the conditions (3) is satisfied
automatically: $I = I(h)$. To verify the second condition, we consider the
behavior of $S(I, q)$ “in the large.” After a circuit of the closed curve $M_{h(I)}$ the
integral of $p\,dq$ increases by

$$\Delta S(I) = \oint_{M_{h(I)}} p\,dq,$$

equal to the area $\Pi$ enclosed by the curve $M_{h(I)}$. Therefore, the function $S$
is a “multiple-valued function” on $M_{h(I)}$: it is determined up to addition
of integral multiples of $\Pi$. This term has no effect on the derivative $\partial S(I, q)/\partial q$;
but it leads to the multi-valuedness of $\varphi = \partial S/\partial I$. This derivative turns out
to be defined only up to multiples of $d\,\Delta S(I)/dI$. More precisely, the formulas
(4) define a 1-form $d\varphi$ on the curve $M_{h(I)}$, and the integral of this form on
$M_{h(I)}$ is equal to $d\,\Delta S(I)/dI$.
In order to fulfill the second condition, $\oint_{M_h} d\varphi = 2\pi$, we need that

$$\frac{d}{dI}\,\Delta S(I) = 2\pi, \qquad I = \frac{\Delta S}{2\pi} = \frac{\Pi}{2\pi},$$

where $\Pi = \oint_{M_h} p\,dq$ is the area bounded by the phase curve $H = h$.


Definition. The action variable in the one-dimensional problem with
hamiltonian function $H(p, q)$ is the quantity $I(h) = (1/2\pi)\,\Pi(h)$.


Finally, we arrive at the following conclusion. Let $d\Pi/dh \ne 0$. Then the
inverse $I(h)$ of the function $h(I)$ is defined.

Theorem. Set $S(I, q) = \int_{q_0}^{q} p\,dq\,\big|_{H = h(I)}$. Then formulas (4) give a canonical
transformation $p, q \to I, \varphi$ satisfying conditions (3).


Thus, the action-angle variables in the one-dimensional case are con- 
structed. 


PROBLEM. Find $S$ and $I$ for a harmonic oscillator.


Answer. If $H = \tfrac12 a^2p^2 + \tfrac12 b^2q^2$ (Figure 217), then $M_h$ is the ellipse bounding the
area $\Pi(h) = \pi(\sqrt{2h}/a)(\sqrt{2h}/b) = 2\pi h/ab = 2\pi h/\omega$. Thus for a harmonic oscillator the action
variable is the ratio of energy to frequency. The angle variable $\varphi$ is, of course, the phase of
oscillation.
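
The value $I = h/\omega$ can be checked numerically. The Python sketch below is an added illustration, not part of the original text: the helper names `action_variable` and `oscillator_action`, the substitution $q = \text{mid} + \text{half}\cdot\sin\theta$, and the sample values of $h$, $a$, $b$ are assumptions made for the example. It evaluates $I = (1/2\pi)\oint p\,dq$ by the midpoint rule for a curve symmetric under $p \to -p$.

```python
import math

def action_variable(p_of_q, q_min, q_max, n=2000):
    """Approximate I = (1/2π) ∮ p dq for a closed curve symmetric under p -> -p.

    p_of_q(q) is the upper branch p >= 0 between the turning points.
    The substitution q = mid + half*sin(theta) removes the square-root
    behaviour of p at the turning points before the midpoint rule is applied.
    """
    mid = 0.5 * (q_min + q_max)
    half = 0.5 * (q_max - q_min)
    d_theta = math.pi / n
    upper = 0.0
    for j in range(n):
        theta = -0.5 * math.pi + (j + 0.5) * d_theta
        q = mid + half * math.sin(theta)
        upper += p_of_q(q) * half * math.cos(theta) * d_theta
    area = 2.0 * upper              # Π(h): upper and lower branches together
    return area / (2.0 * math.pi)

def oscillator_action(h, a, b):
    """Action of H = a^2 p^2/2 + b^2 q^2/2 on the level curve H = h."""
    q_max = math.sqrt(2.0 * h) / b
    def p_upper(q):
        # max(0, ...) guards against tiny negative values from round-off
        return math.sqrt(max(0.0, 2.0 * h - (b * q) ** 2)) / a
    return action_variable(p_upper, -q_max, q_max)

if __name__ == "__main__":
    h, a, b = 1.5, 2.0, 3.0         # sample values; the frequency is ω = a*b
    print(oscillator_action(h, a, b), h / (a * b))
```

The two printed numbers should agree closely, since the midpoint rule integrates $\cos^2\theta$ over a full period without discretization error.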


Figure 217 Action variable for a harmonic oscillator


PROBLEM. Show that the period $T$ of motion along the closed curve $H = h$ on the phase plane
$p, q$ is equal to the derivative with respect to $h$ of the area bounded by this curve:

$$T = \frac{d\Pi(h)}{dh}.$$


Solution. In action-angle variables the equations of motion (2) give

$$\dot\varphi = \frac{\partial H}{\partial I} = \left(\frac{dI}{dh}\right)^{-1} = 2\pi\left(\frac{d\Pi}{dh}\right)^{-1}, \qquad T = \frac{2\pi}{\dot\varphi} = \frac{d\Pi}{dh}.$$
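
The relation $T = d\Pi/dh$ can be tested on the pendulum of Example 2. The sketch below is an added illustration under stated assumptions: the function names are hypothetical, the level $h$ is taken in the oscillation range $-1 < h < 1$, the derivative is replaced by a central finite difference, and the comparison value uses the classical formula $T = 4K(k)$, $k = \sin(q_{\max}/2)$, with the complete elliptic integral $K$ computed by the arithmetic-geometric mean.

```python
import math

def pendulum_action(h, n=4000):
    """I(h) = (1/2π) ∮ p dq for H = p^2/2 - cos q, with -1 < h < 1."""
    q_max = math.acos(-h)                        # turning points: cos q = -h
    d_theta = math.pi / n
    upper = 0.0
    for j in range(n):
        theta = -0.5 * math.pi + (j + 0.5) * d_theta
        q = q_max * math.sin(theta)              # smooths the endpoint behaviour
        p = math.sqrt(max(0.0, 2.0 * (h + math.cos(q))))
        upper += p * q_max * math.cos(theta) * d_theta
    return 2.0 * upper / (2.0 * math.pi)

def period_from_area(h, dh=1e-4):
    """T = dΠ/dh = 2π dI/dh, by a central finite difference."""
    return 2.0 * math.pi * (pendulum_action(h + dh) - pendulum_action(h - dh)) / (2.0 * dh)

def period_exact(h):
    """T = 4 K(k), k = sin(q_max/2), using K(k) = π / (2 AGM(1, sqrt(1 - k^2)))."""
    k = math.sin(0.5 * math.acos(-h))
    x, y = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return 2.0 * math.pi / x

if __name__ == "__main__":
    for h in (-0.9, 0.0, 0.5):
        print(h, period_from_area(h), period_exact(h))
```

For $h$ close to $-1$ both values approach $2\pi$, the period of small oscillations.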


C Construction of action-angle variables in $\mathbb{R}^{2n}$


We turn now to systems with n degrees of freedom given in $\mathbb{R}^{2n} = \{(p, q)\}$
by a hamiltonian function $H(p, q)$ and having n first integrals in involution
$F_1 = H, F_2, \ldots, F_n$. We will not repeat the reasoning which brought us to
the choice of $2\pi I = \oint p\,dq$ in the one-dimensional case, but will immediately
define n action variables $I$.

Let $\gamma_1, \ldots, \gamma_n$ be a basis for the one-dimensional cycles on the torus $M_f$
(the increase of the coordinate $\varphi_i$ on the cycle $\gamma_j$ is equal to $2\pi$ if $i = j$ and
0 if $i \ne j$). We set

$$I_i(f) = \frac{1}{2\pi} \oint_{\gamma_i} p\,dq. \tag{5}$$


Figure 218 Independence of the curve of integration for the action variable 


PROBLEM. Show that this integral does not depend on the choice of the curve $\gamma_i$ representing
the cycle (Figure 218).

Hint. In Section 49 we showed that the 2-form $\omega^2 = \sum dp_i \wedge dq_i$ on the manifold $M_f$ is
equal to zero. By Stokes’ formula,

$$\oint_{\gamma} p\,dq - \oint_{\gamma'} p\,dq = \iint_{\sigma} dp \wedge dq = 0,$$

where $\partial\sigma = \gamma - \gamma'$.


Definition. The n quantities $I_i(f)$ given by formula (5) are called the action
variables.


We assume now that, for the given values $f_i$ of the n integrals $F_i$, the n
quantities $I_i$ are independent: $\det(\partial I/\partial f)|_f \ne 0$. Then in a neighborhood
of the torus $M_f$ we can take the variables $I, \varphi$ as coordinates.


Theorem. The transformation $p, q \to I, \varphi$ is canonical, i.e.,
$$\sum dp_i \wedge dq_i = \sum dI_i \wedge d\varphi_i.$$
We outline the proof of this theorem. Consider the differential 1-form
$p\,dq$ on $M_f$. Since the manifold $M_f$ is null (Section 49) this 1-form on $M_f$
is closed: its exterior derivative $\omega^2 = dp \wedge dq$ is identically equal to zero
on $M_f$. Therefore (Figure 219),

$$S(x) = \int_{x_0}^{x} p\,dq\,\Big|_{M_f}$$


does not change under deformations of the path of integration (Stokes’
formula). Thus $S(x)$ is a “multiple-valued function” on $M_f$, with periods
equal to

$$\Delta_i S = \oint_{\gamma_i} dS = 2\pi I_i.$$

Figure 219 Independence of the path for the integral of $p\,dq$ on $M_f$


Now let $x_0$ be a point on $M_f$, in a neighborhood of which the n variables
$q$ are coordinates on $M_f$, such that the submanifold $M_f \subset \mathbb{R}^{2n}$ is given by n
equations of the form $p = p(I, q)$, $q(x_0) = q_0$. In a simply connected
neighborhood of the point $q_0$ a single-valued function is defined,

$$S(I, q) = \int_{q_0}^{q} p(I, q)\,dq,$$

and we can use it as the generating function of a canonical transformation
$p, q \to I, \varphi$:

$$p = \frac{\partial S}{\partial q}, \qquad \varphi = \frac{\partial S}{\partial I}.$$

It is not difficult to verify that these formulas actually give a canonical
transformation, not only in a neighborhood of the point under consideration,
but also “in the large” in a neighborhood of $M_f$. The coordinates $\varphi$ will be
multiple-valued with periods

$$\Delta_i \varphi_j = \Delta_i \frac{\partial S}{\partial I_j} = \frac{\partial}{\partial I_j}\,\Delta_i S = \frac{\partial (2\pi I_i)}{\partial I_j} = 2\pi\delta_{ij},$$

as was to be shown. □


We now note that all our constructions involve only “algebraic” 
operations (inverting functions) and “quadrature”—calculation of the 
integrals of known functions. In this way the problem of integrating a 
canonical system with 2n equations, of which n first integrals in involution 
are known, is solved by quadratures, which proves the last assertion of 
Liouville’s theorem (Section 49). □


Remark 1. Even in the one-dimensional case the action-angle variables
are not uniquely defined by the conditions (3). We could have taken
$I' = I + \text{const}$ for the action variable and $\varphi' = \varphi + c(I)$ for the angle
variable.


Remark 2. We constructed action-angle variables for systems with phase
space $\mathbb{R}^{2n}$. We could also have introduced action-angle variables for a system
on an arbitrary symplectic manifold. We restrict ourselves here to one simple
example (Figure 220).

Figure 220 Action-angle variables on a symplectic manifold


We could have taken the phase space of a pendulum ($H = \tfrac12 p^2 - \cos q$)
to be, instead of the plane $\{(p, q)\}$, the surface of the cylinder $\mathbb{R}^1 \times S^1$
obtained by identifying angles $q$ differing by an integral multiple of $2\pi$.

The critical level lines $H = \pm 1$ divide the cylinder into three parts,
A, B, and C, each of which is diffeomorphic to the direct product $\mathbb{R}^1 \times S^1$.
We can introduce action-angle variables into each part. In the bounded part
(B) the closed trajectories represent the oscillation of the pendulum; in
the unbounded parts they represent rotation.


Remark 3. In the general case, as in the example analyzed above, the
equations $F_i = f_i$ cease to be independent for some values of $f_i$, and $M_f$ ceases
to be a manifold. Such critical values of $f$ correspond to separatrices dividing
the phase space of the integrable problem into parts corresponding to the
parts A, B, and C above. In some of these parts the manifolds $M_f$ can be
unbounded (parts A and C in the plane $\{(p, q)\}$); others are stratified into
n-dimensional invariant tori $M_f$; in a neighborhood of such a torus we
can introduce action-angle variables.


51 Averaging 


In this section we show that time averages and space averages are equal for systems
undergoing conditionally periodic motion.


A Conditionally periodic motion 


In the earlier sections of this book, we have frequently encountered
conditionally periodic motion: Lissajous figures, precession, nutation, rotation of
a top, etc.


Definition. Let $T^n$ be the n-dimensional torus and $\varphi = (\varphi_1, \ldots, \varphi_n) \bmod 2\pi$
angular coordinates. Then by a conditionally periodic motion we mean a
one-parameter group of diffeomorphisms $T^n \to T^n$ given by the
differential equations (Figure 221):

$$\dot\varphi = \omega, \qquad \omega = (\omega_1, \ldots, \omega_n) = \text{const}.$$

Figure 221 Conditionally periodic motion


These differential equations are easily integrated:

$$\varphi(t) = \varphi(0) + \omega t.$$

Thus the trajectories in the chart $\{\varphi\}$ are straight lines. A trajectory on the
torus is called a winding of the torus.


EXAMPLE. Let $n = 2$. If $\omega_1/\omega_2 = k_1/k_2$ with integers $k_1, k_2$, the trajectories are closed; if $\omega_1/\omega_2$ is irrational, then
trajectories on the torus are dense (cf. Section 16).


The quantities $\omega_1, \ldots, \omega_n$ are called the frequencies of the conditionally
periodic motion. The frequencies are called independent if they are linearly
independent over the field of rational numbers: if $k \in \mathbb{Z}^n$⁸⁹ and $(k, \omega) = 0$,
then $k = 0$.


B Space average and time average 
Let $f(\varphi)$ be an integrable function on the torus $T^n$.


Definition. The space average of a function $f$ on the torus $T^n$ is the number

$$\bar f = (2\pi)^{-n} \int_0^{2\pi} \!\cdots\! \int_0^{2\pi} f(\varphi)\,d\varphi_1 \cdots d\varphi_n.$$

Consider the value of the function $f(\varphi)$ on the trajectory $\varphi(t) = \varphi_0 + \omega t$.
This is a function of time, $f(\varphi_0 + \omega t)$. We consider its average.


Definition. The time average of the function $f$ on the torus $T^n$ is the function

$$f^*(\varphi_0) = \lim_{T \to \infty} \frac{1}{T} \int_0^T f(\varphi_0 + \omega t)\,dt$$

(defined where the limit exists).

Theorem on the averages. The time average exists everywhere, and coincides
with the space average if $f$ is continuous (or merely Riemann integrable)
and the frequencies $\omega_i$ are independent.


⁸⁹ $k = (k_1, \ldots, k_n)$ with integral $k_i$.

PROBLEM. Show that if the frequencies are dependent, then the time average can differ from the
space average.
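
The theorem and the preceding problem can be explored numerically. The Python sketch below is an added illustration with hypothetical names and choices: the test function $f(\varphi_1, \varphi_2) = (1 + \cos\varphi_1)(1 + \cos\varphi_2)$ on the 2-torus, the frequency vectors $(1, \sqrt2)$ (independent) and $(1, 1)$ (dependent), and a finite averaging time $T$ in place of the limit.

```python
import math

def f(phi1, phi2):
    """Sample continuous function on the 2-torus; its space average equals 1."""
    return (1.0 + math.cos(phi1)) * (1.0 + math.cos(phi2))

def time_average(omega, phi0, T, dt=0.01):
    """(1/T) ∫_0^T f(phi0 + omega*t) dt, approximated by the midpoint rule."""
    steps = int(round(T / dt))
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * dt
        total += f(phi0[0] + omega[0] * t, phi0[1] + omega[1] * t)
    return total * dt / T

def space_average(n=200):
    """(2π)^-2 ∬ f dφ1 dφ2, approximated on an n-by-n midpoint grid."""
    h = 2.0 * math.pi / n
    total = sum(f((i + 0.5) * h, (j + 0.5) * h) for i in range(n) for j in range(n))
    return total / (n * n)

if __name__ == "__main__":
    print("space average       :", space_average())
    print("independent (1, √2) :", time_average((1.0, math.sqrt(2.0)), (0.3, 1.1), 2000.0))
    print("dependent   (1, 1)  :", time_average((1.0, 1.0), (0.0, 0.0), 2000.0))
```

For the dependent frequencies $(1, 1)$ and initial point $(0, 0)$ the trajectory stays on the closed curve $\varphi_1 = \varphi_2$, where $f = (1 + \cos t)^2$ has mean $3/2$, so the time average differs from the space average 1.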


Corollary 1. If the frequencies are independent, then every trajectory $\{\varphi(t)\}$
is dense on the torus $T^n$.


PROOF. Assume the contrary. Then in some neighborhood $D$ of some point
of the torus, there is no point of the trajectory $\varphi(t)$. It is easy to construct a
continuous function $f$ equal to zero outside $D$ and with space average equal
to 1. The time average $f^*(\varphi_0)$ on the trajectory $\varphi(t)$ is equal to $0 \ne 1$.
This contradicts the assertion of the theorem. □


Corollary 2. If the frequencies are independent, then every trajectory is
uniformly distributed on the torus $T^n$.


This means that the time the trajectory spends in a neighborhood $D$ is
proportional to the measure of $D$.

More precisely, let $D$ be a (Jordan) measurable region of $T^n$. We denote
by $\tau_D(T)$ the amount of time that the interval $0 \le t \le T$ of the trajectory
$\varphi(t)$ is inside of $D$. Then

$$\lim_{T \to \infty} \frac{\tau_D(T)}{T} = \frac{\operatorname{mes} D}{(2\pi)^n}.$$


PROOF. We apply the theorem to the characteristic function $f$ of the set $D$
($f$ is Riemann integrable since $D$ is Jordan measurable). Then $\int_0^T f(\varphi(t))\,dt = \tau_D(T)$,
and $\bar f = (2\pi)^{-n} \operatorname{mes} D$, and the corollary follows immediately from
the theorem. □


Corollary. In the sequence
$$1, 2, 4, 8, 1, 3, 6, 1, 2, 5, 1, 2, \ldots$$
of first digits of the numbers $2^n$, the number 7 appears $(\log 8 - \log 7)/(\log 9 - \log 8)$ times as
often as 8.
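
This corollary is the case $n = 1$ of uniform distribution: the first digit of $2^n$ is determined by the fractional part of $n\log_{10}2$, that is, by a rotation of the circle through an irrational angle. The Python sketch below is an added illustration; the names `leading_digit` and `digit_frequencies`, the sample size of 100,000 terms, and the small round-off guard are assumptions of the example.

```python
import math

ALPHA = math.log10(2.0)    # rotation angle on the circle R/Z

def leading_digit(n):
    """Leading decimal digit of 2**n, read off from the fractional part of n*log10(2)."""
    frac = (n * ALPHA) % 1.0
    # The 1e-9 guard keeps exact cases such as 2, 4, 8 from rounding down.
    return min(9, int(10.0 ** frac + 1e-9))

def digit_frequencies(n_terms):
    """counts[d] = number of n in [0, n_terms) for which 2**n starts with digit d."""
    counts = [0] * 10
    for n in range(n_terms):
        counts[leading_digit(n)] += 1
    return counts

counts = digit_frequencies(100_000)
observed = counts[7] / counts[8]
predicted = (math.log(8) - math.log(7)) / (math.log(9) - math.log(8))

if __name__ == "__main__":
    print([leading_digit(n) for n in range(12)])
    print("observed ratio 7:8 =", observed, " predicted =", predicted)
```

Floating-point logarithms are used instead of exact powers of 2 so that the circle rotation is explicit; for much larger ranges of $n$ the round-off in $n\log_{10}2$ would need more care.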


The theorem on averages may be found implicitly in the work of Laplace, 
Lagrange, and Gauss on celestial mechanics; it is one of the first “ergodic 
theorems.” A rigorous proof was given only in 1909 by P. Bohl, W. Sierpinski, 
and H. Weyl in connection with a problem of Lagrange on the mean motion 
of the earth’s perihelion. Below we reproduce H. Weyl’s proof. 


C Proof of the theorem on averages 
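Lemma 1. The theorem is true for exponentials $f = e^{i(k, \varphi)}$, $k \in \mathbb{Z}^n$.


PROOF. If $k = 0$, then $f = \bar f = f^* = 1$ and the theorem is obvious. If $k \ne 0$,
then $\bar f = 0$ and, since the frequencies are independent, $(k, \omega) \ne 0$. On the other hand,

$$\int_0^T e^{i(k, \varphi_0 + \omega t)}\,dt = e^{i(k, \varphi_0)}\,\frac{e^{i(k, \omega)T} - 1}{i(k, \omega)}.$$

Therefore, the time average is

$$\lim_{T \to \infty} \frac{e^{i(k, \varphi_0)}}{i(k, \omega)}\,\frac{e^{i(k, \omega)T} - 1}{T} = 0. \qquad □$$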


Lemma 2. The theorem is true for trigonometric polynomials

$$f = \sum_{|k| < N} f_k\,e^{i(k, \varphi)}.$$

PROOF. Both the time and space averages depend linearly on $f$, and therefore
agree by Lemma 1. □


Lemma 3. Let $f$ be a real continuous (or at least Riemann integrable) function.
Then, for any $\varepsilon > 0$, there exist two trigonometric polynomials $P_1$ and $P_2$
such that $P_1 < f < P_2$ and $(1/(2\pi)^n)\int_{T^n} (P_2 - P_1)\,d\varphi \le \varepsilon$.


PROOF. Suppose first that $f$ is continuous. By the Weierstrass theorem, we
can approximate $f$ by a trigonometric polynomial $P$ with $|f - P| < \tfrac12\varepsilon$.
The polynomials $P_1 = P - \tfrac12\varepsilon$ and $P_2 = P + \tfrac12\varepsilon$ are the ones we are looking
for.

If $f$ is not continuous but Riemann integrable, then there are two
continuous functions $f_1$ and $f_2$ such that $f_1 < f < f_2$ and $(2\pi)^{-n}\int (f_2 - f_1)\,d\varphi < \tfrac13\varepsilon$
(Figure 222 corresponds to the characteristic function of an interval).
By approximating $f_1$ and $f_2$ by polynomials $P_1 < f_1 < f_2 < P_2$,
$(2\pi)^{-n}\int (P_2 - f_2)\,d\varphi < \tfrac13\varepsilon$, $(2\pi)^{-n}\int (f_1 - P_1)\,d\varphi < \tfrac13\varepsilon$, we obtain what we
need. Lemma 3 is proved. □


Figure 222 Approximation of the function $f$ by trigonometric polynomials $P_1$ and $P_2$

It is now easy to finish the proof of the theorem. Let $\varepsilon > 0$. Then,
by Lemma 3, there are trigonometric polynomials $P_1 < f < P_2$ with

$$(2\pi)^{-n}\int (P_2 - P_1)\,d\varphi < \varepsilon.$$

For any $T$, we then have

$$\frac1T\int_0^T P_1(\varphi(t))\,dt < \frac1T\int_0^T f(\varphi(t))\,dt < \frac1T\int_0^T P_2(\varphi(t))\,dt.$$

By Lemma 2, for $T > T_0(\varepsilon)$,

$$\left|\bar P_i - \frac1T\int_0^T P_i(\varphi(t))\,dt\right| < \varepsilon \qquad (i = 1, 2).$$

Furthermore, $\bar P_1 < \bar f < \bar P_2$ and $\bar P_2 - \bar P_1 < \varepsilon$. Therefore, $\bar P_2 - \bar f < \varepsilon$ and
$\bar f - \bar P_1 < \varepsilon$; therefore, for $T > T_0(\varepsilon)$,

$$\left|\frac1T\int_0^T f(\varphi(t))\,dt - \bar f\right| < 2\varepsilon,$$

as was to be proved. □


PROBLEM. A two-dimensional oscillator with kinetic energy $T = \tfrac12\dot x^2 + \tfrac12\dot y^2$ and potential
energy $U = \tfrac12 x^2 + y^2$ performs an oscillation with amplitudes $a_x = 1$ and $a_y = 1$. Find the
time average of the kinetic energy.

PROBLEM.⁹⁰ Let $\omega_k$ be independent, $a_k > 0$. Calculate

$$\lim_{t \to \infty} \frac1t \arg \sum_{k=1}^{3} a_k e^{i\omega_k t}.$$

ANSWER. $(\omega_1\alpha_1 + \omega_2\alpha_2 + \omega_3\alpha_3)/\pi$, where $\alpha_1$, $\alpha_2$, and $\alpha_3$ are the angles of the triangle with
sides $a_k$ (Figure 223).




Figure 223 Problem on mean motion of perihelia 


D Degeneracies 


So far we have considered the case when the frequencies $\omega$ are independent.
An integral vector $k \in \mathbb{Z}^n$ is called a relation among the frequencies if

$$(k, \omega) = 0.$$


PROBLEM. Show that the set of all relations between a given set of frequencies $\omega$ is a subgroup
$\Gamma$ of the lattice $\mathbb{Z}^n$.


We saw in Section 49 that such a subgroup consists entirely of linear
combinations of $r$ independent vectors $k_i$, $1 \le r \le n$. We say that there are
$r$ (independent) relations among the frequencies.⁹¹


⁹⁰ Lagrange showed that the investigation of the average motion of the perihelion of a planet
reduces to a similar problem. The solution of this problem can be found in the work of H. Weyl.
The eccentricity of the earth’s orbit varies as the modulus of an analogous sum. Ice ages appear
to be related to these changes in eccentricity.


⁹¹ Show that the number $r$ does not depend on the choice of independent vectors $k_i$.

PROBLEM. Show that the closure of a trajectory $\{\varphi(t) = \varphi_0 + \omega t\}$ (on $T^n$) is a torus of
dimension $n - r$ if there are $r$ independent relations among the frequencies $\omega$; in this case the motion
on $T^{n-r}$ is conditionally periodic with $n - r$ independent frequencies.


We turn now to the integrable hamiltonian system given in action-angle
variables $I, \varphi$ by the equations

$$\dot I = 0, \qquad \dot\varphi = \omega(I), \qquad \text{where } \omega(I) = \frac{\partial H}{\partial I}.$$

Every n-dimensional torus $I = \text{const}$ in the 2n-dimensional phase space is
invariant, and motion on it is conditionally periodic.


Definition. A system is called nondegenerate if the determinant

$$\det\frac{\partial\omega}{\partial I} = \det\frac{\partial^2 H}{\partial I^2}$$

is not zero.
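
Two simple cases may help fix the definition. They are added here as an illustration; the hamiltonians are chosen for simplicity and are not taken from the text. A pair of uncoupled linear oscillators has constant frequencies and is degenerate, while a pair of free rotators is nondegenerate:

```latex
\begin{aligned}
H &= \omega_1 I_1 + \omega_2 I_2: &
\omega(I) &= (\omega_1, \omega_2), &
\det\frac{\partial^2 H}{\partial I^2} &= \det\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0, \\
H &= \tfrac12\bigl(I_1^2 + I_2^2\bigr): &
\omega(I) &= (I_1, I_2), &
\det\frac{\partial^2 H}{\partial I^2} &= \det\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1 .
\end{aligned}
```

In the second case the frequencies themselves can serve as local coordinates in place of $I$, which is the situation used in the hint to the next problem.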


PROBLEM. Show that, if a system is nondegenerate, then in any neighborhood of any point there
is a conditionally periodic motion with n frequencies, and also with any smaller number of
frequencies.

Hint. We can take the frequencies $\omega$ themselves instead of the variables $I$ as local coordinates.
In the space of collections of frequencies, the set of points $\omega$ with any number of relations
$r$ ($0 \le r < n$) is dense.


Corollary. If a system is nondegenerate, then the invariant tori $I = \text{const}$
are uniquely defined, independent of the choice of action-angle coordinates
$I, \varphi$, the construction of which always involves some arbitrariness.⁹²


PROOF. The tori $I = \text{const}$ can be defined as the closures of the phase
trajectories corresponding to the independent $\omega$. □


We note incidentally that, for the majority of values $I$, the frequencies
$\omega$ will be independent.


PROBLEM. Show that the set of $I$ for which the frequencies $\omega(I)$ in a nondegenerate system are
dependent has Lebesgue measure equal to zero.
Hint. Show first that

$$\operatorname{mes}\{\omega : \exists k \ne 0,\ (\omega, k) = 0\} = 0.$$


On the other hand, in degenerate systems we can construct systems of 
action-angle variables such that the tori I = const will be different in dif- 
ferent systems. This is the case because the closures of trajectories in a 
degenerate system are tori of dimension k < n, and they can be contained 
in different ways in n-dimensional tori. 


⁹² For example, we can always write the substitution $I' = I$, $\varphi' = \varphi + S_I(I)$, or
$I_1, I_2;\ \varphi_1, \varphi_2 \to I_1 + I_2, I_2;\ \varphi_1, \varphi_2 - \varphi_1$.

EXAMPLE 1. The planar harmonic oscillator $\ddot{\mathbf{x}} = -\mathbf{x}$; $n = 2$, $k = 1$.
Separation of variables in cartesian and polar coordinates leads to different
action-angle variables and different tori.


EXAMPLE 2. Keplerian planar motion ($U = -1/r$), $n = 2$, $k = 1$. Here,
too, separation of variables in polar and in elliptic coordinates leads to
different $I$.


52 Averaging of perturbations 


Here we show the adiabatic invariance of the action variable in a system with one degree of 
freedom. 


A Systems close to integrable ones 


We have considered a great many integrable systems (one-dimensional 
problems, the two-body problem, small oscillations, the Euler and Lagrange 
cases of the motion of a rigid body with a fixed point, etc.). We studied the 
characteristics of phase trajectories in these systems: they turned out to be 
“windings of tori,” densely filling up the invariant tori in phase space; every 
trajectory is uniformly distributed on this torus. 

One should not conclude from this that integrability is the typical 
situation. Actually, the properties of trajectories in many-dimensional 
systems can be highly diverse and not at all similar to the properties of 
conditionally periodic motions. In particular, the closure of a trajectory 
of a system with n degrees of freedom can fill up complicated sets of dimension 
greater than n in 2n-dimensional phase space; a trajectory could even be 
dense and uniformly distributed on a whole (2n — 1)-dimensional manifold 
given by the equation $H = h$.⁹³ One may call such systems “nonintegrable”
since they do not admit single-valued first integrals independent of H. 
The study of such systems is still far from complete; it constitutes a problem 
in “ergodic theory.” 

One approach to nonintegrable systems is to study systems which are 
close to integrable ones. For example, the problem of the motion of planets 
around the sun is close to the integrable problem of the motion of non- 
interacting points around a stationary center; other examples are the prob- 
lem of the motion of a slightly asymmetric heavy top and the problem of 
nonlinear oscillations close to an equilibrium position (the nearby integrable 
problem is linear). The following method is especially fruitful in the in- 
vestigation of these and similar problems. 


B The averaging principle 
Let $I, \varphi$ be action-angle variables in an integrable (“unperturbed”) system
with hamiltonian function $H_0(I)$:

$$\dot I = 0, \qquad \dot\varphi = \omega(I), \qquad \omega(I) = \frac{\partial H_0}{\partial I}.$$


⁹³ For example, inertial motion on a manifold of negative curvature has this property.

As the nearby “perturbed” system we take the system

$$\dot\varphi = \omega(I) + \varepsilon f(I, \varphi), \qquad \dot I = \varepsilon g(I, \varphi), \tag{1}$$

where $\varepsilon \ll 1$.

We will ignore for a while that the system is hamiltonian and consider
an arbitrary system of differential equations in the form (1) given on the direct
product $T^k \times G$ of the k-dimensional torus $T^k = \{\varphi = (\varphi_1, \ldots, \varphi_k) \bmod 2\pi\}$
and a region $G$ in l-dimensional space $G \subset \mathbb{R}^l = \{I = (I_1, \ldots, I_l)\}$. For
$\varepsilon = 0$ the motion in (1) is conditionally periodic with at most k frequencies
and with k-dimensional invariant tori.

The averaging principle for system (1) consists of its replacement by
another system, called the averaged system:

$$\dot J = \varepsilon\bar g(J), \qquad \bar g(J) = (2\pi)^{-k}\int_0^{2\pi}\!\cdots\!\int_0^{2\pi} g(J, \varphi)\,d\varphi_1 \cdots d\varphi_k, \tag{2}$$

in the l-dimensional region $G \subset \mathbb{R}^l = \{J = (J_1, \ldots, J_l)\}$.

We claim that system (2) is a “good approximation” to system (1). 

We note that this principle is neither a theorem, an axiom, nor a definition, 
but rather a physical proposition, i.e., a vaguely formulated and, strictly 
speaking, untrue assertion. Such assertions are often fruitful sources of 
mathematical theorems. 

This averaging principle may be found explicitly in the work of Gauss 
(in studying the perturbations of planets on one another, Gauss proposed 
to distribute the mass of each planet around its orbit proportionally to time 
and to replace the attraction of each planet by the attraction of the ring so 
obtained). Nevertheless, a satisfactory description of the connection between 
the solutions of systems (1) and (2) in the general case has not yet been found. 

In replacing system (1) by system (2) we discard the term $\varepsilon\tilde g(I, \varphi) =
\varepsilon g(I, \varphi) - \varepsilon\bar g(I)$ on the right-hand side. This term has order $\varepsilon$ as does the
remaining term $\varepsilon\bar g$. In order to understand the different roles of the terms
$\bar g$ and $\tilde g$ in $g$, we consider the simplest example.


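PROBLEM. Consider the case $k = l = 1$,
$$\dot\varphi = \omega \ne 0, \qquad \dot I = \varepsilon g(\varphi).$$
Show that for $0 < t < 1/\varepsilon$,
$$|I(t) - J(t)| < c\varepsilon, \qquad \text{where } J(t) = I(0) + \varepsilon\bar g t.$$


Solution.

$$I(t) - I(0) = \int_0^t \varepsilon g(\varphi_0 + \omega t)\,dt = \int_0^t \varepsilon\bar g\,dt + \frac{\varepsilon}{\omega}\int_0^{\omega t} \tilde g(\varphi_0 + \psi)\,d\psi = \varepsilon\bar g t + \frac{\varepsilon}{\omega}\,h(\omega t),$$

where $h(\varphi) = \int_0^{\varphi} \tilde g(\varphi_0 + \psi)\,d\psi$ is a periodic, and therefore bounded, function.

Figure 224 Evolution and oscillation


Thus the variation in $I$ with time consists of two parts: an oscillation of
order $\varepsilon$ depending on $\tilde g$ and a systematic “evolution” with velocity $\varepsilon\bar g$
(Figure 224).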
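
The split into evolution and oscillation can be seen in a small computation. The Python sketch below is an added illustration under stated assumptions: the perturbation $g(\varphi) = 1 + \cos\varphi$ (so that $\bar g = 1$), the values $\omega = 2$ and $\varphi_0 = 0$, and the function name `max_deviation` are all choices made for the example. For this $g$ the solution above gives $I(t) - J(t) = (\varepsilon/\omega)\sin\omega t$, so the deviation should stay near $\varepsilon/\omega$ at its largest.

```python
import math

def max_deviation(eps, omega=2.0, phi0=0.0, steps_per_unit=200):
    """Largest |I(t) - J(t)| over 0 <= t <= 1/eps for dφ/dt = omega, dI/dt = eps*g(φ).

    g(φ) = 1 + cos φ is an illustrative choice with mean value 1, so the
    averaged solution is J(t) = I(0) + eps*t. I(t) is accumulated by the
    midpoint rule.
    """
    dt = 1.0 / steps_per_unit
    steps = int(round(steps_per_unit / eps))
    I = J = 0.0
    worst = 0.0
    for j in range(steps):
        t_mid = (j + 0.5) * dt
        I += eps * (1.0 + math.cos(phi0 + omega * t_mid)) * dt   # exact system
        J += eps * 1.0 * dt                                      # averaged system
        worst = max(worst, abs(I - J))
    return worst

if __name__ == "__main__":
    for eps in (0.01, 0.001):
        print(eps, max_deviation(eps), max_deviation(eps) / eps)
```

The printed ratio stays close to $1/\omega = 0.5$ for both values of $\varepsilon$, which is the statement $|I(t) - J(t)| < c\varepsilon$ with a constant $c$ independent of $\varepsilon$.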

The averaging principle is based on the assertion that in the general
case the motion of system (1) can be divided into the “evolution” (2) and
small oscillations. In its general form, this assertion is invalid and the principle
itself is untrue. Nevertheless, we will apply the principle to the hamiltonian
system (1):

$$\dot\varphi = \frac{\partial}{\partial I}\bigl(H_0(I) + \varepsilon H_1(I, \varphi)\bigr), \qquad \dot I = -\frac{\partial}{\partial\varphi}\bigl(H_0(I) + \varepsilon H_1(I, \varphi)\bigr).$$

For the right-hand side of the averaged system (2) we then obtain

$$\bar g = -(2\pi)^{-n}\int_0^{2\pi} \frac{\partial}{\partial\varphi} H_1(I, \varphi)\,d\varphi = 0.$$

In other words, there is no evolution in a nondegenerate hamiltonian system.

One variant of this entirely nonrigorous deduction leads to the so- 
called Laplace theorem: The semi-major axes of the keplerian ellipses of 
the planets have no secular perturbations. 

The discussion above suffices to convince us of the importance of the 
averaging principle; we now formulate a theorem justifying this principle 
in one very particular case—that of single-frequency oscillations (k = 1). 
This theorem shows that the averaging principle correctly describes evolution 
over a large interval of time ($0 < t < 1/\varepsilon$).


C Averaging in a single-frequency system 
Consider the system of $l + 1$ differential equations

$$\begin{aligned} \dot\varphi &= \omega(I) + \varepsilon f(I, \varphi), & \varphi \bmod 2\pi &\in S^1, \\ \dot I &= \varepsilon g(I, \varphi), & I &\in G \subset \mathbb{R}^l, \end{aligned} \tag{1}$$

where $f(I, \varphi + 2\pi) = f(I, \varphi)$ and $g(I, \varphi + 2\pi) = g(I, \varphi)$, together with the
“averaged” system of $l$ equations

$$\dot J = \varepsilon\bar g(J), \qquad \text{where } \bar g(J) = \frac{1}{2\pi}\int_0^{2\pi} g(J, \varphi)\,d\varphi. \tag{2}$$

Figure 225 Theorem on averaging


We denote by $I(t), \varphi(t)$ the solution of system (1) with initial conditions
$I(0), \varphi(0)$, and by $J(t)$ the solution of system (2) with the same initial
conditions $J(0) = I(0)$ (Figure 225).

Theorem. Suppose that:


1. the functions $\omega$, $f$, and $g$ are defined for $I$ in a bounded region $G$, and in
this region they are bounded, together with their derivatives up to second
order:

$$\|\omega, f, g\|_{C^2(G \times S^1)} < c_1;$$

2. in the region $G$, we have
$$\omega(I) > c > 0;$$
3. for $0 \le t \le 1/\varepsilon$, a neighborhood of radius $d$ of the point $J(t)$ belongs to $G$:
$$J(t) \in G - d.$$
Then for sufficiently small $\varepsilon$ ($0 < \varepsilon < \varepsilon_0$)

$$|I(t) - J(t)| < c_9\varepsilon \qquad \text{for all } t,\ 0 \le t \le \frac{1}{\varepsilon},$$

where the constant $c_9 > 0$ depends on $c_1$, $c$, and $d$, but not on $\varepsilon$.
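
The estimate of the theorem can be observed on a small example. The Python sketch below is an added illustration, not the author's construction: the right-hand sides $\omega(I) = 1 + I^2$, $f = 0$, $g(I, \varphi) = -I(1 + \cos\varphi)$, the initial data, the classical fourth-order Runge-Kutta integrator, and all function names are assumptions made for the example. For this choice $\bar g(J) = -J$, so the averaged solution is $J(t) = I(0)e^{-\varepsilon t}$.

```python
import math

def rhs(state, eps):
    """Right-hand side of the perturbed system for an illustrative ω, f, g."""
    phi, I = state
    omega = 1.0 + I * I                  # ω(I) >= 1 > 0, as hypothesis 2 requires
    f = 0.0
    g = -I * (1.0 + math.cos(phi))       # mean over φ is ḡ(I) = -I
    return (omega + eps * f, eps * g)

def rk4_step(state, eps, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = rhs(state, eps)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), eps)
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), eps)
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)), eps)
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_gap(eps, I0=1.0, phi0=0.0, dt=0.01):
    """Largest |I(t) - J(t)| over 0 <= t <= 1/eps, with J(t) = I0*exp(-eps*t)."""
    state = (phi0, I0)
    steps = int(round(1.0 / (eps * dt)))
    worst = 0.0
    for j in range(1, steps + 1):
        state = rk4_step(state, eps, dt)
        J = I0 * math.exp(-eps * j * dt)
        worst = max(worst, abs(state[1] - J))
    return worst

if __name__ == "__main__":
    for eps in (0.02, 0.01):
        print(eps, max_gap(eps), max_gap(eps) / eps)
```

The ratio `max_gap(eps) / eps` plays the role of the constant $c_9$; for this example it stays of order one as $\varepsilon$ decreases, while the interval of integration $1/\varepsilon$ grows.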


Some applications of this theorem will be given below (“adiabatic in- 
variants”). We remark that the basic idea of the proof of this theorem 
(a change of variables diminishing the perturbation) is more important than 
the theorem itself; this is one of the basic ideas in the theory of ordinary 
differential equations; it is encountered in elementary courses as the “method 
of variation of constants.” 


D Proof of the theorem on averaging 
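In place of the variables $I$ we will introduce new variables $P$

$$P = I + \varepsilon k(I, \varphi), \tag{3}$$

where the function $k$, $2\pi$-periodic in $\varphi$, will be chosen so that the vector $P$
will satisfy a simpler differential equation.

By (1) and (3), the rate of change of $P(t)$ is

$$\dot P = \dot I + \varepsilon\frac{\partial k}{\partial I}\dot I + \varepsilon\frac{\partial k}{\partial\varphi}\dot\varphi = \varepsilon\left[g + \frac{\partial k}{\partial\varphi}\,\omega\right] + \varepsilon^2\frac{\partial k}{\partial I}\,g + \varepsilon^2\frac{\partial k}{\partial\varphi}\,f. \tag{4}$$

We assume that the substitution (3) can be inverted, so that

$$I = P + \varepsilon h(P, \varphi, \varepsilon) \tag{5}$$

(where the functions $h$ are $2\pi$-periodic in $\varphi$).
Then (4) and (5) imply that $P(t)$ satisfies the system of equations

$$\dot P = \varepsilon\left[g(P, \varphi) + \frac{\partial k}{\partial\varphi}\,\omega(P)\right] + R, \tag{6}$$

where the “remainder term” $R$ is small of second order with respect to $\varepsilon$:

$$|R| < c_2\varepsilon^2, \qquad c_2(c_1, c_3, c_4) > 0, \tag{7}$$

if only

$$\|\omega\|_{C^2} < c_1, \quad \|f\|_{C^2} < c_1, \quad \|g\|_{C^2} < c_1, \quad \|k\|_{C^2} < c_3, \quad \|h\|_{C^2} < c_4. \tag{8}$$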


We will now try to choose the change of variables (3) so that the term
involving $\varepsilon$ in (6) becomes zero. For $k$ we get the equation

$$\frac{\partial k}{\partial\varphi} = -\frac{g}{\omega}.$$

In general, such an equation is not solvable in the class of functions $k$
periodic in $\varphi$. In fact, the average value (with respect to $\varphi$) of the left-hand side
is always equal to 0, and the average value of the right-hand side can be
different from 0. Therefore, we cannot choose $k$ in such a way as to kill the
entire term involving $\varepsilon$ in (6). However, we can kill the entire “periodic”
part of $g$,

$$\tilde g(P, \varphi) = g(P, \varphi) - \bar g(P),$$

by setting

$$k(P, \varphi) = -\int_0^{\varphi} \frac{\tilde g(P, \psi)}{\omega(P)}\,d\psi. \tag{9}$$

So we define the function $k$ by formula (9). Then, by hypotheses 1 and
2 of the theorem, the function $k$ satisfies the estimate $\|k\|_{C^2} < c_3$, where
$c_3(c_1, c) > 0$. In order to establish the inequality (8), we must estimate $h$.
For this we must first show that the substitution (3) is invertible.

Fix a positive number $\alpha$.


Lemma. If $\varepsilon$ is sufficiently small, then the restriction of the mapping (3)⁹⁴

$$I \to I + \varepsilon k, \qquad \text{where } \|k\|_{C^2(G)} < c_3,$$

to the region $G - \alpha$ (consisting of points whose $\alpha$-neighborhood is contained
in $G$) is a diffeomorphism. The inverse diffeomorphism (5) in the region
$G - 2\alpha$ satisfies the estimate $\|h\|_{C^2} < c_4$ with some constant $c_4(\alpha, c_3) > 0$.


⁹⁴ For any fixed value of the parameter $\varphi$.


PROOF. The necessary estimate follows directly from the implicit function
theorem. The only difficulty is in verifying that the map $I \to I + \varepsilon k$ is
one-to-one in the region $G - \alpha$. We note that the function $k$ satisfies a Lipschitz
condition (with some constant $L(\alpha, c_3)$) in $G - \alpha$. Consider two points
$I_1, I_2$ in $G - \alpha$. For sufficiently small $\varepsilon$ (namely, for $L\varepsilon < 1$) the distance
between $\varepsilon k(I_1)$ and $\varepsilon k(I_2)$ will be smaller than $|I_1 - I_2|$. Therefore,
$I_1 + \varepsilon k(I_1) \ne I_2 + \varepsilon k(I_2)$. Thus the map (3) is one-to-one on $G - \alpha$, and
the lemma is proved. □


It follows from the lemma that for $\varepsilon$ small enough all the estimates (8)
are satisfied. Thus the estimate (7) is also true.
We now compare the system of differential equations for J

(2) $\dot J = \varepsilon\bar g(J)$

and for P; the latter, in view of (9), takes the form

(6′) $\dot P = \varepsilon\bar g(P) + R.$

Since the difference between the right sides is of order $\lesssim \varepsilon^2$ (cf. (7)), for time
$t \lesssim 1/\varepsilon$ the difference $|P - J|$ between the solutions is of order $\varepsilon$ (Figure 226).
On the other hand, $|I - P| = \varepsilon|k| \lesssim \varepsilon$. Thus, for $t \lesssim 1/\varepsilon$, the difference
$|I - J|$ is of order $\lesssim \varepsilon$, as was to be proved. □

Figure 226 Proof of the theorem on averaging


To find an accurate estimate, we introduce the quantity
(10) $z(t) = P(t) - J(t).$
Then (6′) and (9) imply
$\dot z = \varepsilon(\bar g(P) - \bar g(J)) + R = \varepsilon\frac{\partial \bar g}{\partial P}z + R',$
where $|R'| < c_2\varepsilon^2 + c_5\varepsilon|z|$ if the segment (P, J) lies in $G - \alpha$. Under this assumption we find
(11) $|\dot z| < c_6\varepsilon|z| + c_2\varepsilon^2$ (where $c_6 = c_5 + c_1$),
$|z(0)| < c_3\varepsilon.$

Lemma. If $|\dot z| \le a|z| + b$ and $|z(0)| < d$ for a, b, d, t > 0, then $|z(t)| \le (d + bt)e^{at}$.
PROOF. $|z(t)|$ is no greater than the solution y(t) of the equation $\dot y = ay + b$, y(0) = d. Solving
this equation, we find $y = Ce^{at}$, $\dot C e^{at} = b$, $\dot C = e^{-at}b$, C(0) = d, $C \le d + bt$. □

Now from (11) and the assumption that the segment (P, J) lies in $G - \alpha$ (Figure 226), we have
$|z(t)| < (c_3\varepsilon + c_2\varepsilon^2 t)e^{c_6\varepsilon t}.$
From this it follows that, for $0 \le t \le 1/\varepsilon$,
$|z(t)| < c_7\varepsilon, \quad c_7 = (c_3 + c_2)e^{c_6}.$

We see that, if $\alpha = d/3$ and $\varepsilon$ is small enough, the entire segment (P(t), J(t)) ($t \le 1/\varepsilon$) lies inside
$G - \alpha$ and, therefore,

$|P(t) - J(t)| < c_8\varepsilon$ for all $0 \le t \le 1/\varepsilon$.

On the other hand, $|P(t) - I(t)| < |\varepsilon k| < c_3\varepsilon$. Thus, for all t with $0 \le t \le 1/\varepsilon$,

$|I(t) - J(t)| < c_9\varepsilon, \quad c_9 = c_8 + c_3 > 0,$

and the theorem is proved. □
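The statement of the averaging theorem can be checked numerically on a toy system. The sketch below is only an illustration under assumptions that are not in the text: the system $\dot\varphi = 1$, $\dot I = \varepsilon I(1 + \cos\varphi)$ (so that the averaged equation is $\dot J = \varepsilon J$), the function names, and the Runge-Kutta step size are all chosen for this example.

```python
import math

def rk4_step(f, t, y, dt):
    """One classical Runge-Kutta step for y' = f(t, y), y given as a list."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k1)])
    k3 = f(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k2)])
    k4 = f(t + dt, [a + dt * b for a, b in zip(y, k3)])
    return [a + dt / 6 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(y, k1, k2, k3, k4)]

def max_averaging_error(eps, dt=0.01):
    """Largest |I(t) - J(t)| on 0 <= t <= 1/eps for the assumed toy system.

    Exact system:    phi' = 1,  I' = eps * I * (1 + cos phi)
    Averaged system: J' = eps * J   (the mean of 1 + cos phi is 1)
    """
    def rhs(t, y):
        phi, i_exact, j_avg = y
        return [1.0, eps * i_exact * (1.0 + math.cos(phi)), eps * j_avg]

    t, y, worst = 0.0, [0.0, 1.0, 1.0], 0.0
    for _ in range(int(round(1.0 / (eps * dt)))):
        y = rk4_step(rhs, t, y, dt)
        t += dt
        worst = max(worst, abs(y[1] - y[2]))
    return worst

if __name__ == "__main__":
    for eps in (0.01, 0.005):
        print(eps, max_averaging_error(eps))
```

Halving $\varepsilon$ should roughly halve the reported error, in line with the estimate $|I(t) - J(t)| < c_9\varepsilon$ on the time interval $0 \le t \le 1/\varepsilon$.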


E Adiabatic invariants 


Consider a hamiltonian system with one degree of freedom, with hamiltonian
function $H(p, q; \lambda)$ depending on a parameter $\lambda$. As an example, we can take
a pendulum:

$H = \frac{p^2}{2l^2} + lg\frac{q^2}{2};$

as the parameter $\lambda$ we can take the length l or the acceleration of gravity g.
Suppose that the parameter changes slowly with time. It turns out that in 
the limit as the rate of change of the parameter approaches 0, there is a 
remarkable asymptotic phenomenon: two quantities, generally independent, 
become functions of one another. 

Assume, for example, that the length of the pendulum changes slowly 
(in comparison with its characteristic oscillations). Then the amplitude 
of its oscillation becomes a function of the length of the pendulum. If we 
very slowly increase by a factor of two the length of the pendulum and then 
very slowly decrease it to the original value, then at the end of this process 
the amplitude of the oscillation will be the same as it was at the start. 

Furthermore, it turns out that the ratio of the energy H of the pendulum 
to the frequency $\omega$ changes very little under a slow change of the parameter,
although the energy and frequency themselves may change a lot. Quantities 
such as this ratio, which change little under slow changes of parameter, 
are called by physicists adiabatic invariants. 

It is easy to see that the adiabatic invariance of the ratio of the energy 
of a pendulum to its frequency is an assertion of a physical character, i.e., it is 
untrue without further assumptions. In fact, if we vary the length of a 
pendulum arbitrarily slowly, but choose the phase of oscillation under which
the length increases and decreases, we can set the pendulum swinging
(parametric resonance). In view of this, physicists have suggested formulating
the definition of adiabatic invariance as follows: the person changing the
parameters of the system must not see what state the system is in (Figure 227).
Giving this definition a rigorous mathematical meaning is a very delicate
and as yet unsolved problem. Fortunately, we can get along with a surrogate.
The assumption of ignorance of the internal state of the system on the part
of the person controlling the parameter may be replaced by the requirement
that the change of parameter must be smooth, i.e., twice continuously
differentiable.

Figure 227 Adiabatic change in the length of a pendulum

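More precisely, let $H(p, q; \lambda)$ be a fixed, twice continuously differentiable
function of $\lambda$. Set $\lambda = \varepsilon t$ and consider the resulting system with slowly
varying parameter $\lambda = \varepsilon t$:

(*) $\dot p = -\frac{\partial H}{\partial q}, \quad \dot q = \frac{\partial H}{\partial p}, \quad H = H(p, q; \varepsilon t).$

Definition. The quantity $I(p, q; \lambda)$ is an adiabatic invariant of the system (*)
if for every $\kappa > 0$ there is an $\varepsilon_0 > 0$ such that if $0 < \varepsilon < \varepsilon_0$ and $0 < t < 1/\varepsilon$,
then

$|I(p(t), q(t); \varepsilon t) - I(p(0), q(0); 0)| < \kappa.$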


Clearly, every first integral is also an adiabatic invariant. It turns out that 
every one-dimensional system (*) has an adiabatic invariant. Namely, the 
adiabatic invariant is the action variable in the corresponding problem 
with constant coefficients. 

Assume that the phase trajectories of the system with hamiltonian
$H(p, q; \lambda)$ are closed. We define a function $I(p, q; \lambda)$ in the following way.
For fixed $\lambda$ there is a phase portrait corresponding to the hamiltonian function
$H(p, q; \lambda)$ (Figure 228). Consider the closed phase trajectory passing through
a point (p, q). It bounds some region in the phase plane. We denote the area
of this region by $2\pi I(p, q; \lambda)$. I = const on every phase trajectory (for
given $\lambda$). Clearly, I is nothing but the action variable (cf. Section 50).


Theorem. If the frequency $\omega(I, \lambda)$ of the system (*) is nowhere zero, then
$I(p, q; \lambda)$ is an adiabatic invariant.


Figure 228 Adiabatic invariant of a one-dimensional system


F Proof of the adiabatic invariance of action 


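For fixed $\lambda$ we can introduce action-angle variables I, $\varphi$ into the system (*)
by a canonical transformation depending on $\lambda$: $p, q \to I, \varphi$; $\dot\varphi = \omega(I, \lambda)$,
$\dot I = 0$; $\omega(I, \lambda) = \partial H_0/\partial I$, $H_0 = H_0(I, \lambda)$.

We denote by $S(I, q; \lambda)$ the (multiple-valued) generating function of this
transformation:

$p = \frac{\partial S}{\partial q}, \quad \varphi = \frac{\partial S}{\partial I}.$

Now let $\lambda = \varepsilon t$. Since the change from variables p, q to variables I, $\varphi$ is now
performed by a time-dependent canonical transformation, the equations of
motion in the new variables I, $\varphi$ have the hamiltonian form, but with
hamiltonian function (cf. Section 45A)

$K = H_0 + \frac{\partial S}{\partial t} = H_0 + \varepsilon\frac{\partial S}{\partial \lambda}.$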


PROBLEM. Show that $\partial S(I, q; \lambda)/\partial\lambda$ is a single-valued function on the phase plane.
Hint. S is determined up to the addition of multiples of $2\pi I$.


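In this way we obtain the equations of motion in the form

$\dot\varphi = \omega(I, \lambda) + \varepsilon f(I, \varphi; \lambda), \quad f = \frac{\partial^2 S}{\partial I\,\partial\lambda},$
$\dot I = \varepsilon g(I, \varphi; \lambda), \quad g = -\frac{\partial^2 S}{\partial\varphi\,\partial\lambda},$
$\dot\lambda = \varepsilon.$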

Since $\omega \ne 0$, the averaging theorem (Section 52C) is applicable. The
averaged system has the form

$\dot J = \varepsilon\bar g, \quad \dot\Lambda = \varepsilon.$

But $g = -(\partial/\partial\varphi)(\partial S/\partial\lambda)$, and $\partial S/\partial\lambda$ is a single-valued function on the circle
I = const. Therefore, $\bar g = (2\pi)^{-1}\int g\,d\varphi = 0$, and in the averaged system J
does not change at all: J(t) = J(0).
By the averaging theorem, $|I(t) - I(0)| < c\varepsilon$ for all t with $0 \le t \le 1/\varepsilon$,
as was to be proved. □


EXAMPLE. For a harmonic oscillator (cf. Figure 217),

$H = \frac{a^2}{2}p^2 + \frac{b^2}{2}q^2, \quad I = \frac{1}{2\pi}\,\pi\,\frac{\sqrt{2h}}{a}\,\frac{\sqrt{2h}}{b} = \frac{h}{ab}, \quad \omega = ab,$

i.e., the ratio of energy to frequency is an adiabatic invariant.
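The adiabatic invariance of the ratio of energy to frequency can be observed in a short numerical experiment. The sketch below is an illustration only; the particular hamiltonian $H = p^2/2 + \omega(\varepsilon t)^2 q^2/2$ with $\omega(\lambda) = 1 + \lambda$, the initial condition, and the function names are assumptions made for the example.

```python
def rk4_step(f, t, y, dt):
    """One classical Runge-Kutta step for y' = f(t, y), y given as a list."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k1)])
    k3 = f(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k2)])
    k4 = f(t + dt, [a + dt * b for a, b in zip(y, k3)])
    return [a + dt / 6 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(y, k1, k2, k3, k4)]

def slowly_stiffened_oscillator(eps=0.01, dt=0.005):
    """Integrate q' = p, p' = -omega(eps*t)^2 q on 0 <= t <= 1/eps.

    Returns (E0, E1, I0, I1): energy and energy/frequency at start and end.
    The frequency law omega = 1 + eps*t is an assumption of this sketch.
    """
    def omega(t):
        return 1.0 + eps * t

    def rhs(t, y):
        q, p = y
        return [p, -omega(t) ** 2 * q]

    def energy(t, y):
        q, p = y
        return 0.5 * p * p + 0.5 * omega(t) ** 2 * q * q

    t, y = 0.0, [1.0, 0.0]
    e0 = energy(t, y)
    for _ in range(int(round(1.0 / (eps * dt)))):
        y = rk4_step(rhs, t, y, dt)
        t += dt
    e1 = energy(t, y)
    return e0, e1, e0 / omega(0.0), e1 / omega(t)

if __name__ == "__main__":
    e0, e1, i0, i1 = slowly_stiffened_oscillator()
    print("energy ratio:", e1 / e0, " action ratio:", i1 / i0)
```

In this run the frequency doubles, so the energy should roughly double as well, while the ratio of energy to frequency stays close to its initial value.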


Figure 229 Adiabatic invariant of an absolutely elastic ball between slowly changing walls


PROBLEM. The length of a pendulum is slowly doubled ($l = l_0(1 + \varepsilon t)$,
$0 \le t \le 1/\varepsilon$). How does the amplitude $q_{\max}$ of the oscillations vary?


Solution. $I = \tfrac12 l^{3/2} g^{1/2} q_{\max}^2$; therefore,

$q_{\max}(t) = q_{\max}(0)\left(\frac{l(0)}{l(t)}\right)^{3/4}.$
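The solution can be retraced step by step: for small oscillations the energy at a turning point is $lgq_{\max}^2/2$, the frequency is $\sqrt{g/l}$, and their ratio is the action. The following sketch (function names and the sample numbers are assumptions for illustration) computes the action this way and inverts it for the new amplitude.

```python
import math

def action(length, g, q_max):
    """Action of the linearized pendulum H = p^2/(2 l^2) + l g q^2 / 2."""
    energy = 0.5 * length * g * q_max ** 2   # H at the turning point p = 0
    omega = math.sqrt(g / length)            # frequency of small oscillations
    return energy / omega                    # equals 0.5 * l**1.5 * g**0.5 * q_max**2

def amplitude_after(l_new, l_old, g, q_old):
    """Amplitude at length l_new, assuming the action is conserved."""
    i_old = action(l_old, g, q_old)
    return math.sqrt(2.0 * i_old / (l_new ** 1.5 * math.sqrt(g)))

if __name__ == "__main__":
    # Doubling the length scales the amplitude by 2**(-3/4).
    print(amplitude_after(2.0, 1.0, 9.8, 0.1) / 0.1)
```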
As a second example, consider the motion of a perfectly elastic rigid ball
of mass 1 between perfectly elastic walls whose separation l slowly varies
(Figure 229). We may consider that a point is moving in an “infinitely deep
rectangular potential well,” and that the phase trajectories are rectangles
of area 2vl, where v is the velocity of the ball. In this case the product vl
of the velocity of the ball and the distance between the walls turns out to be
an adiabatic invariant.²⁵ Thus if we make the walls twice as close together,
the velocity of the ball doubles, and if we separate the walls, the velocity
decreases.
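The behavior of the product vl can be explored with a small event-driven simulation. This is a sketch under assumptions not made in the text: one wall is fixed at x = 0, the other moves with a constant small velocity u < 0 toward it, and the product is sampled each time the ball returns to the fixed wall. The function name and default values are hypothetical.

```python
def squeeze_ball(l0=1.0, v0=1.0, u=-1e-3, l_target=0.5):
    """Ball bouncing between a fixed wall (x = 0) and a wall at l(t) = l0 + u*t.

    Assumes u < 0 (walls approaching) and v0 > 0. Returns (speed, separation)
    at the first return to the fixed wall with separation <= l_target.
    """
    l, v = l0, v0                 # ball starts at x = 0 moving toward the far wall
    while l > l_target:
        dt = l / (v - u)          # ball meets the moving wall: v*dt = l + u*dt
        l += u * dt               # separation at the moment of impact
        x = l                     # ball position at impact
        v = v - 2.0 * u           # elastic reflection off a wall with velocity u
        dt = x / v                # flight back to the fixed wall
        l += u * dt               # wall keeps moving during the return flight
    return v, l

if __name__ == "__main__":
    v, l = squeeze_ball()
    print("speed:", v, "separation:", l, "product:", v * l)
```

With the walls brought roughly twice as close together the speed should come out roughly doubled, while the product of speed and separation stays near its initial value.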


²⁵ This does not formally follow from the theorem, since the theorem concerns smooth systems
without shocks. The proof of the adiabatic invariance of vl in this system is an instructive elementary
problem.


From a sheet of paper, one can form a cone or a cylinder, but it is impossible
to obtain a piece of a sphere without folding, stretching, or cutting. The reason 
lies in the difference between the “intrinsic geometries” of these surfaces: no 
part of the sphere can be isometrically mapped onto the plane. 

The invariant which distinguishes riemannian metrics is called riemannian 
curvature. The riemannian curvature of a plane is zero, and the curvature of 
a sphere of radius R is equal to $R^{-2}$. If one riemannian manifold can be isometrically
mapped to another, then the riemannian curvature at corresponding
points is the same. For example, since a cone or cylinder is locally isometric
to the plane, the riemannian curvature of the cone or cylinder at any
point is equal to zero. Therefore, no region of a cone or cylinder can be mapped
isometrically to a sphere. 

The riemannian curvature of a manifold has a very important influence 
on the behavior of geodesics on it, i.e, on motion in the corresponding 
dynamical system. If the riemannian curvature of a manifold is positive (as 
on a sphere or ellipsoid), then nearby geodesics oscillate about one another 
in most cases, and if the curvature is negative (as on the surface of a hyper- 
boloid of one sheet), geodesics rapidly diverge from one another. 

In this appendix we define riemannian curvature and briefly discuss the
properties of geodesics on manifolds of negative curvature. A further treat- 
ment of riemannian curvature can be found in the book, “Morse Theory” 
by John Milnor, Princeton University Press, 1963, and a treatment of 
geodesics on manifolds of negative curvature in D. V. Anosov’s book, 
“Geodesic flows on closed riemannian manifolds with negative curvature,” 
Proceedings of the Steklov Institute of Mathematics, No. 90 (1967), Am. 
Math. Soc., 1969. 


A Parallel translation on surfaces


The definition of riemannian curvature is based on the construction of parallel 
translation of vectors along curves on a riemannian manifold. 

We begin with the case when the given riemannian manifold is two- 
dimensional, i.e., a surface, and the given curve is a geodesic on this surface. 
[See do Carmo, Manfredo Perdigao, “Differential Geometry of Curves and 
Surfaces,” Prentice-Hall, 1976. (Translator’s note) ] 

Parallel translation of a vector tangent to the surface along a geodesic on 
this surface is defined as follows: the point of origin of the vector moves along 
the geodesic, and the vector itself moves continuously so that its angle with 
the geodesic and its length remain constant. By translating to the endpoint 
of the geodesic all vectors tangent to the surface at the initial point, we obtain 
a map from the tangent plane at the initial point to the tangent plane at the 
endpoint. This map is linear and isometric. 

We now define parallel translation of a vector on a surface along a broken 
line consisting of several geodesic arcs (Figure 230). In order to translate a 
vector along a broken line, we translate it from the first vertex to the second
along the first geodesic arc, then translate this vector along the second arc
to the next vertex, etc.


Figure 230 Parallel translation along a broken geodesic


PROBLEM. Given a vector tangent to the sphere at one vertex of a spherical triangle with three 
right angles, translate this vector around the triangle and back to the same vertex. 


ANSWER. As a result of this translation the tangent plane to the sphere at the initial vertex will
be turned by a right angle. 


Finally, parallel translation of a vector along any smooth curve on a surface 
is defined by a limiting procedure, in which the curve is approximated by 
broken lines consisting of geodesic arcs. 


PROBLEM. Translate a vector directed towards the North Pole and located at Leningrad (latitude
$\lambda = 60^\circ$) around the 60th parallel and back to Leningrad, moving to the east.


ANSWER. The vector turns through the angle $2\pi(1 - \sin\lambda)$, i.e., approximately 50° to the west.
Thus the size of the angle of rotation is proportional to the area bounded by our parallel, and 
the direction of rotation coincides with the direction the origin of the vector is going around the 
North Pole. 

Hint. It is sufficient to translate the vector along the same circle on the cone formed by the 
tangent lines to the meridian, going through all the points of the parallel (Figure 231). This cone 
then can be unrolled onto the plane, after which parallel translation on its surface becomes 
ordinary parallel translation on the plane. 
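The answer to the Leningrad problem can also be checked numerically. The sketch below does not use the cone construction of the hint; instead it assumes a different, standard discretization: the sphere is embedded in three-dimensional space, and at each small step along the parallel the vector is projected onto the new tangent plane and rescaled to unit length. For small steps this approximates parallel translation on an embedded surface. The function name and step count are hypothetical.

```python
import math

def transport_around_parallel(lat_deg, n_steps=20000):
    """Angle in degrees between a north-pointing tangent vector and its image
    after being carried once around the parallel of latitude lat_deg.

    Discretization (an assumption of this sketch): project onto the tangent
    plane at each new point of the unit sphere, then renormalize.
    """
    lat = math.radians(lat_deg)

    def point(lon):
        # Point of the unit sphere; it doubles as the unit normal there.
        return (math.cos(lat) * math.cos(lon),
                math.cos(lat) * math.sin(lon),
                math.sin(lat))

    v0 = (-math.sin(lat), 0.0, math.cos(lat))   # unit tangent pointing north at lon = 0
    v = v0
    for k in range(1, n_steps + 1):
        n = point(2.0 * math.pi * k / n_steps)  # moving east
        dot = sum(a * b for a, b in zip(v, n))
        w = [a - dot * b for a, b in zip(v, n)] # remove the normal component
        norm = math.sqrt(sum(a * a for a in w))
        v = tuple(a / norm for a in w)
    cos_angle = sum(a * b for a, b in zip(v, v0))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

if __name__ == "__main__":
    print(transport_around_parallel(60.0))      # compare with 360 * (1 - sin 60 deg)
```

The function returns only the size of the turn, not its direction; for latitude 60° it should be close to $360^\circ(1 - \sin 60^\circ) \approx 48.2^\circ$.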


Figure 231 Parallel translation on the sphere


EXAMPLE. We consider the upper half-plane y > 0 of the plane of complex numbers z = x + iy
with the metric

$ds^2 = \frac{dx^2 + dy^2}{y^2}.$

It is easy to compute that the geodesics of this two-dimensional riemannian manifold are circles
and straight lines perpendicular to the x-axis. Linear fractional transformations with real
coefficients

$z \to \frac{az + b}{cz + d}$

are isometric transformations of our manifold, which is called the Lobachevsky plane.


PROBLEM. Translate a vector directed along the imaginary axis at the point z = i to the point
z = t + i along the horizontal line (dy = 0) (Figure 232).


ANSWER. Under translation by t the vector turns t radians in the direction from the y-axis towards
the x-axis.
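This answer can be reproduced by integrating the parallel translation equation in coordinates. The sketch below assumes the standard Christoffel symbols of the metric $(dx^2 + dy^2)/y^2$, stated here without derivation ($\Gamma^x_{xy} = \Gamma^x_{yx} = -1/y$, $\Gamma^y_{xx} = 1/y$, $\Gamma^y_{yy} = -1/y$), so that along the line x = t, y = y0 the components satisfy $\dot V^x = V^y/y_0$, $\dot V^y = -V^x/y_0$. The function name and the parameter y0 are hypothetical.

```python
import math

def transport_along_horizontal(t_end, y0=1.0, n_steps=2000):
    """Carry the vector (0, 1) along x = t, y = y0 from t = 0 to t = t_end.

    Returns the angle (radians) by which the vector has turned from the
    y-axis towards the x-axis. Uses the Christoffel symbols quoted in the
    lead-in, which are an input of this sketch, not derived in the code.
    """
    def rhs(v):
        vx, vy = v
        return (vy / y0, -vx / y0)

    v = (0.0, 1.0)
    dt = t_end / n_steps
    for _ in range(n_steps):          # classical Runge-Kutta steps
        k1 = rhs(v)
        k2 = rhs((v[0] + dt / 2 * k1[0], v[1] + dt / 2 * k1[1]))
        k3 = rhs((v[0] + dt / 2 * k2[0], v[1] + dt / 2 * k2[1]))
        k4 = rhs((v[0] + dt * k3[0], v[1] + dt * k3[1]))
        v = (v[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
    return math.atan2(v[0], v[1])

if __name__ == "__main__":
    print(transport_along_horizontal(1.0))   # expected to be close to 1 radian
```

Because the metric is a multiple of the euclidean one at each point, the euclidean angle computed here coincides with the angle in the Lobachevsky metric.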


Figure 232 Parallel translation on the Lobachevsky plane


B The curvature form 


We will now define the riemannian curvature at each point of a two-dimen- 
sional riemannian manifold (i.e., a surface). For this purpose, we choose an 
orientation of our surface in a neighborhood of the point under consideration 
and consider parallel translation of vectors along the boundary of a small 
region D on our surface. It is easy to calculate that the result of such a translation
is rotation by a small angle. We denote this angle by $\varphi(D)$ (the sign of the
angle is fixed by the choice of orientation of the surface).

If we divide the region D into two parts $D_1$ and $D_2$, the result of parallel
translation along the boundary of D can be obtained by first going around
one part, and then the other. Thus,

$\varphi(D) = \varphi(D_1) + \varphi(D_2),$

i.e., the angle $\varphi$ is an additive function of regions. When we change the direction
of travel along the boundary, the angle $\varphi$ changes sign. It is natural
therefore to represent $\varphi(D)$ as the integral over D of a suitable 2-form. Such
a 2-form in fact exists; it is called the curvature form, and we denote it by $\Omega$.
Thus we define the curvature form $\Omega$ by the relation

(1) $\varphi(D) = \int_D \Omega.$


The value of $\Omega$ on a pair of tangent vectors $\xi$, $\eta$ in $TM_x$ can be defined in the
following way. We identify a neighborhood of the point 0 in the tangent space
to M at x with a neighborhood of the point x on M (using, for example,
some local coordinate system). We can then construct on M the parallelogram
$\Pi_\varepsilon$ spanned by the vectors $\varepsilon\xi$, $\varepsilon\eta$, at least for sufficiently small $\varepsilon$.

Now the value of the curvature form on our vectors is defined by the
formula

(2) $\Omega(\xi, \eta) = \lim_{\varepsilon \to 0} \frac{\varphi(\Pi_\varepsilon)}{\varepsilon^2}.$


In other words, the value of the curvature form on a pair of tangent vectors 
is equal to the angle of rotation under translation along the infinitely small 
parallelogram determined by these vectors. 


PROBLEM. Find the curvature forms on the plane, on a sphere of radius R, and on the Lobachevsky 
plane. 


ANSWER. $\Omega = 0$, $\Omega = R^{-2}\,dS$, $\Omega = -dS$, where the 2-form dS is the area element on our
oriented surface. 


PROBLEM. Show that the function defined by formula (2) is really a differential 2-form, independent
of the arbitrary choice involved in the construction, and that the rotation of a vector under 
translation along the boundary of a finite oriented region D is expressed, in terms of this form, 
by formula (1). 


PROBLEM. Show that the integral of the curvature form over any convex surface in three-dimensional
euclidean space is equal to $4\pi$.


C The riemannian curvature of a surface 


We note that every differential 2-form on a two-dimensional oriented
riemannian manifold M can be written in the form $\rho\,dS$, where dS is the
oriented area element and $\rho$ is a scalar function uniquely determined by the
choice of metric and orientation.

In particular, the curvature form can be written in the form

$\Omega = K\,dS,$

where $K: M \to \mathbb{R}$ is a smooth function on M and dS is the area element.
The value of the function K at a point x is called the riemannian curvature 
of the surface at x. 


PROBLEM. Calculate the riemannian curvature of the euclidean space, the sphere of radius R,
and the Lobachevsky plane.

ANSWER. $K = 0$, $K = R^{-2}$, $K = -1$.


PROBLEM. Show that the riemannian curvature does not depend on the orientation of the mani- 
fold, but only on its metric. 
Hint. The 2-forms $\Omega$ and dS both change sign under a change of orientation.


PROBLEM. Show that, for surfaces in ordinary three-dimensional euclidean space, the riemannian 
curvature at every point is equal to the product of the inverses of the principal radii of curvature 
(with minus sign if the centers of curvature lie on opposite sides of the surface). 


We note that the sign of a manifold’s curvature at a point does not depend 
on the orientation of the manifold; this sign may be defined without using the 
orientation at all. 

Namely, on manifolds of positive curvature, a vector parallel translated 
around the boundary of a small region turns around its origin in the same 
direction as the point on the boundary goes around the region; on manifolds 
of negative curvature the direction of rotation is opposite. 

We note further that the value of the curvature at a point is determined 
by the metric in a neighborhood of this point, and therefore is preserved 
under bending: the curvature is the same at corresponding points of iso- 
metric surfaces. Hence, riemannian curvature is also called intrinsic curvature. 

The formulas for computing curvature in terms of components of the 
metric in some coordinate system involve the second derivatives of the metric 
and are rather complicated: cf. the problems in Section G below. 


D Higher-dimensional parallel translation 


The construction of parallel translation on riemannian manifolds of di- 
mension greater than two is somewhat more complicated than the two- 
dimensional construction presented above. The reason is that in these 
dimensions the direction of the vector being translated is no longer determined 
by the condition that the angle with a geodesic be invariant. In fact, the vector 
could rotate around the direction of the geodesic while preserving its angle 
with the geodesic. 

The refinement which we must introduce into the construction of parallel 
translation along a geodesic is the choice of a two-dimensional plane passing 
through the tangent to the geodesic, which must contain the translated vector. 
This choice is made in the following (unfortunately complicated) way. 

At the initial point of a geodesic the needed plane is the plane spanned by 
the vector to be translated and the direction vector of the geodesic. We look 
at all geodesics proceeding from the initial point, in directions lying in this 
plane. The set of all such geodesics (close to the initial point) forms a smooth 
surface which contains the geodesic along which we intend to translate the 
vector (Figure 233). 

Consider a new point on the geodesic at a small distance $\Delta$ from the initial
point. The tangent plane at the new point to the surface described above
contains the direction of the geodesic at this new point. We take this new
point as the initial point and use its tangent plane to construct a new surface
(formed by the bundle of geodesics emanating from the new point). This
surface contains the original geodesic. We move along the original geodesic
again by $\Delta$ and repeat the construction from the beginning.


Figure 233 Parallel translation in space


After a finite number of steps we can reach any point of the original geodesic.
As a result of our work we have, at every point of the geodesic, a tangent
plane containing the direction of the geodesic. This plane depends on the
length $\Delta$ of the steps in our construction. As $\Delta \to 0$ the family of tangent
planes obtained converges (as can be calculated) to a definite limit. As a
result we have a field of two-dimensional tangent planes along our geodesic
containing the direction of the geodesic and determined in an intrinsic
manner by the metric on the manifold.

Now parallel translation of our vector along a geodesic is defined as in the 
two-dimensional case: under translation the vector must remain in the planes 
described above; its length and its angle with the direction of the geodesic 
must be preserved. Parallel translation along any curve is defined using 
approximations by geodesic polygons, as in the two-dimensional case. 


PROBLEM. Show that parallel translation of vectors from one point of a riemannian manifold
to another along a fixed path is a linear isometric operator from the tangent space at the first
point to the tangent space at the second point.

PROBLEM. Parallel translate any vector along the line

$x_1 = t, \quad x_2 = 0, \quad y = 1 \quad (0 \le t \le \tau)$

in a Lobachevsky space with metric

$ds^2 = \frac{dx_1^2 + dx_2^2 + dy^2}{y^2}.$


ANSWER. Vectors in the directions of the $x_1$ and y axes are rotated by angle $\tau$ in the plane spanned
by them (rotation is in the direction from the y-axis towards the $x_1$-axis); vectors in the $x_2$-direction
are carried parallel to themselves in the sense of the euclidean metric.


E The curvature tensor 


We now consider, as in the two-dimensional case, parallel translation along 
small closed paths beginning and ending at a point of a riemannian manifold. 
Parallel translation along such a path returns vectors to the original tangent
space. The map of the tangent space to itself thus obtained is a small rotation
(an orthogonal transformation close to the identity). 


In the two-dimensional case we characterized this rotation by one number, the angle of rotation
$\varphi$. In higher dimensions a skew-symmetric operator plays the role of $\varphi$. Namely, any orthogonal
operator A which is close to the identity can be written in a natural way in the form

$A = e^{\Phi} = E + \Phi + \frac{\Phi^2}{2} + \cdots,$

where $\Phi$ is a small skew-symmetric operator.


PROBLEM. Compute $\Phi$ if A is a rotation of the plane through a small angle $\varphi$.


ANSWER.

$A = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}, \quad \Phi = \begin{pmatrix} 0 & \varphi \\ -\varphi & 0 \end{pmatrix}.$

Unlike in the two-dimensional case, the function $\Phi$ is not generally additive (since the
orthogonal group of n-space for n > 2 is not commutative). Nevertheless, we can construct a
curvature form using $\Phi$, describing the “infinitely small rotation caused by parallel translation
around an infinitely small parallelogram” in the same way as in the two-dimensional case, i.e.,
using formula (2).
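The relation between a plane rotation and its skew-symmetric generator can be checked by summing the exponential series directly. The helper below is a sketch for 2×2 matrices only; the function names and the number of series terms are assumptions made for illustration.

```python
def mat_exp_2x2(m, terms=25):
    """Approximate exp(m) for a 2x2 matrix by the truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]           # holds m**n / n!
    for n in range(1, terms):
        term = [[sum(term[i][k] * m[k][j] for k in range(2)) / n
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def rotation_from_generator(phi):
    """exp of the skew-symmetric matrix [[0, phi], [-phi, 0]]."""
    return mat_exp_2x2([[0.0, phi], [-phi, 0.0]])

if __name__ == "__main__":
    # Rows should be close to (cos 0.1, sin 0.1) and (-sin 0.1, cos 0.1).
    print(rotation_from_generator(0.1))
```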


Thus, let $\xi$ and $\eta$ in $TM_x$ be vectors tangent to the riemannian manifold
M at the point x. Construct a small curvilinear parallelogram $\Pi_\varepsilon$ on M (the
sides of the parallelogram $\Pi_\varepsilon$ are obtained from the vectors $\varepsilon\xi$ and $\varepsilon\eta$ by a
coordinate identification of a neighborhood of zero in $TM_x$ with a neighborhood
of x in M). We will look at parallel translation along the sides of the
parallelogram $\Pi_\varepsilon$ (we begin the circuit at $\xi$).

The result of translation will be an orthogonal transformation of $TM_x$,
close to the identity. It differs from the identity transformation by a quantity
of order $\varepsilon^2$ and has the form

$A_\varepsilon(\xi, \eta) = E + \varepsilon^2\Omega + o(\varepsilon^2),$

where $\Omega$ is a skew-symmetric operator depending on $\xi$ and $\eta$. Therefore, we
can define a function $\Omega$ of pairs of vectors $\xi$, $\eta$ in the tangent space at x with
values in the space of skew-symmetric operators on $TM_x$, by the formula

$\Omega(\xi, \eta) = \lim_{\varepsilon \to 0} \frac{A_\varepsilon(\xi, \eta) - E}{\varepsilon^2}.$


PROBLEM. Show that the function $\Omega$ is a differential 2-form (with values in the skew-symmetric
operators on $TM_x$) and does not depend on the choice of coordinates we used to identify $TM_x$
and M.


The form $\Omega$ is called the curvature tensor of the riemannian manifold.
We could say that the curvature tensor describes the infinitesimal rotation 
in the tangent space obtained by parallel translation around an infinitely 
small parallelogram.


F Curvature in a two-dimensional direction


Consider a two-dimensional subspace L in the tangent space to a riemannian 
manifold at some point. We take geodesics emanating from this point in 
all the directions in L. These geodesics form a smooth surface close to our 
point. The surface constructed lies in the riemannian manifold and has an 
induced riemannian metric. 

By the curvature of a riemannian manifold M in the direction of a 2-plane 
L in the tangent space to M at a point x, we mean the riemannian curvature at
x of the surface described above. 


PROBLEM. Find the curvatures of a three-dimensional sphere of radius R and of Lobachevsky 
space in all possible two-dimensional directions. 


ANSWER. $R^{-2}$, $-1$.


In general, the curvatures of a riemannian manifold in different two- 
dimensional directions are different. Their dependence on the direction is 
described by formula (3) below. 


Theorem. The curvature of a riemannian manifold in the two-dimensional
direction determined by a pair of orthogonal vectors $\xi$, $\eta$ of length 1 can be
expressed in terms of the curvature tensor $\Omega$ by the formula


(3) $K = \langle\Omega(\xi, \eta)\xi, \eta\rangle,$


where the brackets denote the scalar product giving the riemannian metric. 


The proof is obtained by comparing the definitions of the curvature tensor and of curvature 
in a two-dimensional direction. We will not go into it in a rigorous way. It is possible to take 
formula (3) for the definition of the curvature K. 


G Covariant differentiation 


Connected with parallel translation along curves in a riemannian manifold 
is a particular differential calculus—so-called covariant differentiation, or 
the riemannian connection. We define this differentiation in the following 
way. 

Let $\xi$ be a vector tangent to a riemannian manifold M at a point x, and v
a vector field given on M in a neighborhood of x. The covariant derivative
of the field v in the direction $\xi$ is defined by using any curve passing through x
with velocity $\xi$. After moving along this curve for a small interval of time t,
we find ourselves at a new point x(t). We take the vector field v at this point
x(t) and parallel translate it backwards along the curve to the original point
x. We obtain a vector depending on t in the tangent space to M at x. For
t = 0 this vector is v(x), and for other t it changes according to the non-parallelness
of the vector field v along our curve in the direction $\xi$.
Consider the derivative of the resulting vector with respect to t, evaluated
at t = 0. This derivative is a vector in the tangent space $TM_x$. It is called the
covariant derivative of the field v along $\xi$ and is denoted by $\nabla_\xi v$. It is easy to
verify that the vector $\nabla_\xi v$ does not depend on the choice of curve specified in
the definition, but only on $\xi$ and v.


PROBLEM 1. Prove the following properties of covariant differentiation: 


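1. $\nabla_\xi v$ is a bilinear function of $\xi$ and v.

2. $\nabla_\xi(fv) = (L_\xi f)v + f(x)\nabla_\xi v$, where f is a smooth function and $L_\xi f$ is the derivative of f in the
direction of the vector $\xi$ in $TM_x$.

3. $L_\xi\langle v, w\rangle = \langle\nabla_\xi v, w(x)\rangle + \langle v(x), \nabla_\xi w\rangle$.

4. $\nabla_{v(x)} w - \nabla_{w(x)} v = [w, v](x)$ (where $L_{[w,v]} = L_v L_w - L_w L_v$).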


PROBLEM 2. Show that the curvature tensor can be expressed in terms of covariant differentiation 
in the following way: 


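$\Omega(\xi_0, \eta_0)\zeta_0 = -\nabla_\xi\nabla_\eta\zeta + \nabla_\eta\nabla_\xi\zeta + \nabla_{[\eta,\xi]}\zeta,$


where $\xi$, $\eta$, $\zeta$ are any vector fields whose values at the point under consideration are $\xi_0$, $\eta_0$, and $\zeta_0$.


PROBLEM 3. Show that the curvature tensor satisfies the following identities:

$\Omega(\xi, \eta)\zeta + \Omega(\eta, \zeta)\xi + \Omega(\zeta, \xi)\eta = 0,$
$\langle\Omega(\xi, \eta)\alpha, \beta\rangle = \langle\Omega(\alpha, \beta)\xi, \eta\rangle.$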


PROBLEM 4. Suppose that the riemannian metric is given in local coordinates $x_1, \ldots, x_n$ by the
symmetric matrix $g_{ij}$:

$ds^2 = \sum g_{ij}\,dx_i\,dx_j.$

Denote by $e_1, \ldots, e_n$ the coordinate vector fields (so that differentiation in the direction $e_i$ is
$\partial_i = \partial/\partial x_i$). Then covariant derivatives can be calculated using the formulas in Problem 1 and
the following formulas:

$\nabla_{e_i} e_j = \sum_k \Gamma^k_{ij} e_k, \quad \Gamma^k_{ij} = \sum_l \tfrac12\left(\partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij}\right)g^{lk},$

where $(g^{lk})$ is the inverse matrix to $(g_{kl})$.

By using the expression for the curvature tensor in terms of the connection in Problem 2,
we also obtain an explicit formula for the curvature. The numbers $R_{ijkl} = \langle\Omega(e_i, e_j)e_k, e_l\rangle$ are
called the components of the curvature tensor.
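The formula for the coefficients $\Gamma^k_{ij}$ in Problem 4 translates directly into a short computation. The sketch below handles two-dimensional metrics only and approximates the derivatives $\partial_l g_{ij}$ by central differences; the function names, the step h, and the test point are assumptions of this example.

```python
def christoffel_2d(metric, x, h=1e-5):
    """Return gamma[k][i][j] = Gamma^k_ij at the point x for a 2D metric.

    `metric(x)` must return the symmetric 2x2 matrix g_ij at x (a hypothetical
    callback of this sketch). Derivatives are taken by central differences.
    """
    def d_metric(l):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(2)]
                for i in range(2)]

    dg = [d_metric(0), d_metric(1)]            # dg[l][i][j] = d_l g_ij
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    return [[[sum(0.5 * (dg[i][j][l] + dg[j][i][l] - dg[l][i][j]) * g_inv[l][k]
                  for l in range(2))
              for j in range(2)] for i in range(2)] for k in range(2)]

def lobachevsky_metric(x):
    """Metric (dx^2 + dy^2) / y^2 of the upper half-plane."""
    y = x[1]
    return [[1.0 / y ** 2, 0.0], [0.0, 1.0 / y ** 2]]

if __name__ == "__main__":
    gamma = christoffel_2d(lobachevsky_metric, [0.3, 2.0])
    # Indices: 0 = x, 1 = y. Nonzero values should be close to -1/y, 1/y, -1/y.
    print(gamma[0][0][1], gamma[1][0][0], gamma[1][1][1])
```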


H The Jacobi equation 


The riemannian curvature of a manifold is closely connected with the be- 
havior of its geodesics. In particular, let us consider a geodesic passing 
through some point in some direction, and alter slightly the initial conditions,
i.e., the initial point and initial direction. The new initial conditions determine 
a new geodesic. At first this geodesic differs very little from the original geo- 
desic. To investigate the divergence it is useful to linearize the differential 
equation of geodesics close to the original geodesic. The second-order linear
differential equation thus obtained (“the variational equation” for the equa-
tion of geodesics) is called the Jacobi equation; it is convenient to write 
it in terms of covariant derivatives and curvature tensors. 

We denote by x(t) a point moving along a geodesic in the manifold M 
with velocity (of constant magnitude) $v(t) \in TM_{x(t)}$. If the initial condition
depends smoothly on a parameter $\alpha$, then the geodesic also depends smoothly
on the parameter. Consider the motion corresponding to a value of $\alpha$. We
denote the position of a point at time t on the corresponding geodesic by
$x(t, \alpha) \in M$. We will assume that the initial geodesic corresponds to the zero
value of the parameter, so that x(t, 0) = x(t).

The vector field of geodesic variation is the derivative of the function
$x(t, \alpha)$ with respect to $\alpha$, evaluated at $\alpha = 0$; the value of this field at the point
x(t) is equal to

$\left.\frac{d}{d\alpha}\right|_{\alpha=0} x(t, \alpha) = \xi(t) \in TM_{x(t)}.$

To write the variational equation, we define the covariant derivative with
respect to t of a vector field $\zeta(t)$ given on the geodesic x(t). To define this, we
take the vector $\zeta(t + h)$, parallel translate it from the point x(t + h) to
x(t) along the geodesic, differentiate the vector obtained in the tangent space
$TM_{x(t)}$ with respect to h and evaluate at h = 0. The result is a vector in
$TM_{x(t)}$, which is called the covariant derivative of the field $\zeta(t)$ with respect
to t, and denoted by $D\zeta/Dt$.


Theorem. The vector field of geodesic variation satisfies the second-order linear
differential equation

(4) $\frac{D^2\xi}{Dt^2} = -\Omega(v, \xi)v,$


where Q is the curvature tensor, and v = v(t) is the velocity vector of motion 
along the original geodesic. 

Conversely, every solution of the differential equation (4) is a field of 
variation of the original geodesic. 


Equation (4) is called the Jacobi equation. 
PROBLEM. Prove the theorem above. 
PROBLEM. Let M be a surface, y(t) the magnitude of the component of the vector ξ(t) in the
direction normal to a given geodesic, and let the length of the vector v(t) be equal to 1. Show
that y satisfies the differential equation

(5)    $$\ddot{y} = -Ky,$$

where K = K(t) is the riemannian curvature at the point x(t).

PROBLEM. Using Equation (5), compare the behavior of geodesics close to a given one on the
sphere ($K = +R^{-2}$) and on the Lobachevsky plane (K = −1).


I Investigation of the Jacobi equation


In investigating the variational equations, it is useful to disregard the trivial
variations, i.e., changes of the time origin and of the magnitude of the initial
velocity of motion. To this end we decompose the variation vector ξ into
components parallel and perpendicular to the velocity vector v. Then (since
Ω(v, v) = 0 and since the operator Ω(v, ξ) is skew-symmetric) for the normal
component we again get the Jacobi equation, and for the parallel component
$\xi_{\parallel}$ we get the equation

$$\frac{D^2 \xi_{\parallel}}{Dt^2} = 0.$$

We now note that the Jacobi equation for the normal component can be
written in the form of “Newton’s equation”

$$\frac{D^2 \xi}{Dt^2} = -\operatorname{grad} U,$$

where the quadratic form U of the vector ξ is expressed in terms of the curvature
tensor and is proportional to the curvature K in the direction of the
(ξ, v) plane:

$$U(\xi) = \tfrac{1}{2}\langle \Omega(\xi, v)\,v, \xi \rangle = \frac{K}{2}\,\langle \xi, \xi \rangle \langle v, v \rangle.$$


Thus the behavior of the normal component of the variation vector of a 
geodesic with velocity 1 can be described by the equation of a (non-autono- 
mous) linear oscillator whose potential energy is equal to the product of the 
curvature in the direction of the plane of velocity vectors and variations with 
the square of the length of the normal component of the variation. 

Figure 234 Nearby geodesics on manifolds of positive (K > 0) and negative (K < 0) curvature

In particular we consider the case when the curvature is negative in all
two-dimensional directions containing the velocity vector of the geodesic
(Figure 234). Then the divergence of nearby geodesics from the given one in
the normal direction can be described by the equation of an oscillator with
negative definite (and time-dependent) potential energy. Therefore, the
normal component of divergence for nearby geodesics behaves like the divergence
of a ball, located near the top of a hill, from the top. The equilibrium
position of the ball at the top is unstable. This means that geodesics near the
given geodesic will diverge exponentially from it.
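
As a numerical illustration (a sketch under stated assumptions, not part of the
original argument), one can integrate the scalar equation ÿ = −K(t)y for the
normal deviation on a surface traversed with unit velocity. The function name
`integrate_normal_deviation`, the step count, the radius R = 2, and the initial
data y(0) = 0, ẏ(0) = 1 are all choices made for this example only.

```python
import math

def integrate_normal_deviation(K, t_end, steps=2000, y0=0.0, dy0=1.0):
    """Integrate y'' = -K(t) * y with the classical RK4 scheme.

    K is a function of t (the curvature along the geodesic);
    returns the deviation y at time t_end.
    """
    h = t_end / steps
    t, y, v = 0.0, y0, dy0
    for _ in range(steps):
        # First-order system: y' = v, v' = -K(t) * y
        k1y, k1v = v, -K(t) * y
        k2y, k2v = v + 0.5 * h * k1v, -K(t + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -K(t + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -K(t + h) * (y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    return y

R = 2.0  # assumed sphere radius
sphere = integrate_normal_deviation(lambda t: 1.0 / R**2, t_end=5.0)   # K = +R^-2
lobachevsky = integrate_normal_deviation(lambda t: -1.0, t_end=5.0)    # K = -1

print("sphere:     ", sphere, " exact R*sin(t/R):", R * math.sin(5.0 / R))
print("Lobachevsky:", lobachevsky, " exact sinh(t):   ", math.sinh(5.0))
```

On the sphere the deviation stays bounded and oscillates; on the Lobachevsky
plane it grows like sinh t, which is the exponential divergence described above.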


If the potential energy of the newtonian equation we obtained did not depend on time, our
conclusion would be rigorous. Let us assume further that the curvature in the different
directions containing v is in the interval

$$-a^2 < K < -b^2, \quad \text{where } 0 < b < a.$$

Then solutions to the Jacobi equation for normal divergence will be linear combinations of
exponential curves with exponent $\pm\lambda_i$, where the positive numbers $\lambda_i$ are between a and b.
Therefore, every solution to the Jacobi equation grows at least as fast as $e^{b|t|}$ as either
t → +∞ or t → −∞; most solutions grow even faster, with rate $e^{a|t|}$.


The instability of an equilibrium position under negative definite potential 
energy is intuitively obvious also in the non-autonomous case. It can be 
proven by comparison with a corresponding autonomous system. As a 
result of such a comparison we may convince ourselves that under motion 
along a geodesic, all solutions of the Jacobi equation for normal divergence 
on a manifold of negative curvature grow at least as fast as an exponential 
function of the distance traveled, whose exponent is equal to the square 
root of the absolute value of the curvature in the two-dimensional direction 
for which this absolute value is minimal. In fact, most solutions grow even 
faster, but we cannot now assert that the exponent of growth for most solu- 
tions is determined by the direction in which the absolute value of the nega- 
tive curvature is largest. 

In summary, we can say that the behavior of geodesics on a manifold of 
negative curvature is characterized by exponential instability. For numerical 
estimates of this instability, it is useful to define the characteristic path length 
s as the average path length on which small errors in the initial conditions 
are increased e times. 

More precisely, the characteristic path length s can be defined as the inverse
of the exponent λ which characterizes the growth of the solution to the Jacobi
equation for normal divergence from the geodesic proceeding with velocity 1:

$$\lambda = \overline{\lim_{T \to \infty}}\; \frac{1}{T} \max_{|t| < T}\; \max_{|\xi(0)| = 1} \ln |\xi(t)|, \qquad s = \frac{1}{\lambda}.$$

In general, the exponent λ and the path s depend on the initial geodesic.
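
As a worked illustration of this definition (a sketch only, under the assumption
of a surface of constant curvature K = −b², and following one particular
solution rather than the full maximum over initial conditions), the normal
deviation can be written explicitly:

```latex
% Assumption for illustration: constant curvature K = -b^2 with b > 0,
% and the particular solution with y(0) = 1, \dot{y}(0) = 0.
\ddot{y} = b^2 y, \qquad y(0) = 1,\ \dot{y}(0) = 0
\quad\Longrightarrow\quad y(t) = \cosh bt,
\\[4pt]
\lambda = \lim_{T \to \infty} \frac{1}{T} \max_{|t| < T} \ln \cosh bt
        = \lim_{T \to \infty} \frac{\ln \cosh bT}{T} = b,
\qquad s = \frac{1}{\lambda} = \frac{1}{b}.
```

In this model case the characteristic path length is exactly the inverse of the
square root of the absolute value of the curvature.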

If the curvature of our manifold in all two-dimensional directions is
bounded away from zero by the number $-b^2$, then the characteristic path
length is less than or equal to $b^{-1}$. Thus as the curvature of a manifold gets
more negative, the characteristic path length s, on which the instability of
geodesics is reduced to e-fold growth of error, gets smaller. In view of the
exponential character of the growth of error, the course of a geodesic on a
manifold of negative curvature is practically impossible to predict.

Assume, for example, that the curvature is negative and bounded away
from zero by $-4\ \mathrm{m}^{-2}$. The characteristic path length is less than or equal to
half a meter, i.e., on a geodesic arc five meters long the error grows by approximately
$e^{10} \sim 10^4$. Therefore, an error of a tenth of a millimeter in the initial
conditions shows up in the form of a one-meter difference at the end of the
geodesic.
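
The arithmetic of this example can be retraced in a few lines. This is only a
sketch of the estimate above; the variable names are made up for the
illustration, and the bound is read as K ≤ −4 m⁻².

```python
import math

curvature_bound = -4.0                 # K <= -4 m^-2 (from the example above)
b = math.sqrt(-curvature_bound)        # b = 2 m^-1
s_max = 1.0 / b                        # characteristic path length <= 0.5 m

arc_length = 5.0                       # length of the geodesic arc, in meters
e_folds = arc_length / s_max           # at least 10 e-foldings of the error
growth = math.exp(e_folds)             # e^10, of order 10^4

initial_error_m = 1.0e-4               # a tenth of a millimeter
final_error_m = initial_error_m * growth

print(f"s <= {s_max} m, e-folds >= {e_folds}, growth ~ {growth:.3g}")
print(f"final error ~ {final_error_m:.2f} m")
```

Since e¹⁰ is about 2.2 · 10⁴, the final error comes out at roughly two meters,
which agrees in order of magnitude with the one-meter figure quoted in the text.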


J Geodesic flows on compact manifolds of 
negative curvature 


Let M be a compact riemannian manifold whose curvature at every point 
in every two-dimensional direction is negative. (Such manifolds exist.) 
Consider the inertial motion of a point of mass 1 on M, without any external 
forces. The lagrangian function of this system is equal to the kinetic energy, 
which is equal to the total energy and is a first integral of the equations of 
motion. 

If M has dimension n, then each energy level manifold has dimension
2n − 1. This manifold is a submanifold of the tangent bundle of M. For
example, we can fix the value of the energy at ½ (which corresponds to initial
velocity 1). Then the velocity vector of the point has length constantly equal
to 1, and our level manifold turns out to be the fiber bundle

$$T_1 M \subset TM$$

consisting of the unit spheres in the tangent spaces to M at every point.

Thus, a point of the manifold $T_1M$ is represented as a vector of length
1 at a point of M. By the Maupertuis–Jacobi principle, we can describe the
motion of a point mass with fixed initial conditions in the following way:
the point moves with velocity 1 along the geodesic determined by the indicated
vector.

By the law of conservation of energy the manifold $T_1M$ is an invariant
manifold in the phase space of our system. Therefore, our phase flow determines
a one-parameter group of diffeomorphisms on the (2n − 1)-dimensional
manifold $T_1M$. This group is called the geodesic flow on M.
The geodesic flow can be described as follows: the transformation at time t
carries the unit vector $\xi \in T_1M$ located at the point x, to the unit velocity
vector of the geodesic coming from x in the direction ξ, located at the point
at distance t from x. We note that there is a naturally defined volume element
on $T_1M$ and that the geodesic flow preserves it (Liouville’s theorem).

Up to now we have not used the negative curvature of the manifold M. 
But if we investigate the trajectories of the geodesic flow, it turns out that the 
negative curvature of M has a strong impact on the behavior of these trajectories
(this is related to the exponential instability of geodesics on M).

Here are some properties of geodesic flows on manifolds of negative
curvature (for further details, see the book of D. V. Anosov cited earlier).


1. Almost all phase trajectories are dense in the energy level manifold (the 
exceptional non-dense trajectories form a set of measure zero). 

2. Uniform distribution: the amount of time which almost every trajectory
spends in any region of the phase space $T_1M$ is proportional to the volume
of the region.

3. The phase flow $g^t$ has the mixing property: if A and B are two regions, then

$$\lim_{t \to \infty} \operatorname{mes}[(g^t A) \cap B] = \operatorname{mes} A \cdot \operatorname{mes} B$$


(where mes denotes the volume, normalized by the condition that the 
whole space have measure 1). 
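
The mixing property can be observed numerically on a much simpler system.
The sketch below does not simulate a geodesic flow: as a stand-in it uses a
hyperbolic automorphism of the two-dimensional torus, (x, y) → (2x + y, x + y)
mod 1, which is a different (discrete-time) system that also preserves area and
is exponentially unstable. The regions A and B, the number of sample points,
and the number of iterations are arbitrary choices made for the illustration.

```python
import random

def torus_map(x, y):
    # Stand-in dynamics (an assumption of this sketch, not a geodesic flow):
    # the hyperbolic torus automorphism (x, y) -> (2x + y, x + y) mod 1.
    return (2.0 * x + y) % 1.0, (x + y) % 1.0

def in_region_B(x, y):
    # Region B = [0, 1/2) x [0, 1/2), of normalized measure 1/4.
    return x < 0.5 and y < 0.5

random.seed(0)
n_points, n_steps = 50_000, 12

# Sample points uniformly in region A = [0, 1/2) x [0, 1/2), mes A = 1/4.
points = [(0.5 * random.random(), 0.5 * random.random()) for _ in range(n_points)]
for _ in range(n_steps):
    points = [torus_map(x, y) for x, y in points]

# Fraction of the transported region A that now lies in B.
fraction_in_B = sum(in_region_B(x, y) for x, y in points) / n_points
mes_A, mes_B = 0.25, 0.25
estimate = mes_A * fraction_in_B      # Monte Carlo estimate of mes[(g^t A) ∩ B]

print("estimated mes[(g^t A) ∩ B]:", estimate)
print("mes A * mes B:             ", mes_A * mes_B)
```

After a dozen iterations the image of A is spread so evenly over the torus that
the fraction of it lying in B is close to mes B, as the mixing property predicts.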


From these properties of trajectories in phase space follow analogous 
statements about geodesics on the manifold itself. Physicists call these 
properties “stochastic”: asymptotically for large t the trajectories behave as 
if the point were random. For example, the mixing property means that the 
probability of turning up in B at a time t long after exiting from A is propor- 
tional to the volume of B. 

Thus, the exponential instability of geodesics on manifolds of negative 
curvature leads to the stochasticity of the corresponding geodesic flow. 


K Other applications of exponential instability


The exponential instability property of geodesics on manifolds of negative 
curvature has been studied by many authors, beginning with Hadamard (and, 
in the case of constant curvature, also by Lobachevsky), but especially by 
E. Hopf. An unexpected discovery of the 1960s in this area was the surprising 
stability of exponentially unstable systems with respect to perturbations of the 
systems themselves. 

Consider, for example, the vector field giving the geodesic flow on a com- 
pact surface of negative curvature. As we showed above, the phase curves 
of this flow are arranged in a complicated way: almost every one of them is 
dense in the three-dimensional energy level manifold. The flow has infinitely 
many closed trajectories, and the set of points on closed trajectories is also 
dense in the three-dimensional energy level manifold. 

We now consider a nearby vector field. It turns out that, in spite of the 
complexity of the picture of phase curves, the entire picture with dense 
phase curves and infinitely many closed trajectories hardly changes at all if 
we pass to the nearby field. In fact, there is a homeomorphism close to the 
identity transformation which takes the phase curves of the unperturbed 
flow to the phase curves of the perturbed flow. 

Thus our complicated phase flow has the same property of “structural
stability” as a limit cycle, or a stable focus in the plane. We note that neither
a center in the plane nor a winding of the torus has this property of structural 
stability: the topological type of the phase portrait in these cases changes 
for arbitrarily small changes in the vector field. 

The existence of structurally stable systems with complicated motions, 
each of which is in itself exponentially unstable, is one of the basic discoveries 
of recent years in the theory of ordinary differential equations (the con- 
jecture that geodesic flows on manifolds of negative curvature are structurally 
stable was made by S. Smale in 1961, and the proof was given by D. V. 
Anosov and published in 1967; the basic results on stochasticity of these 
flows were obtained by Ya. G. Sinai and D. V. Anosov, also in the 1960s). 

Before these works most mathematicians believed that in systems of 
differential equations in “general form” only the simplest stable limiting 
behaviors were possible: equilibrium positions and cycles. If a system was 
more complicated (for example, if it was conservative), then it was assumed 
that after a small change in its equations (for example, after imposing small 
non-conservative perturbations) complicated motions are “dispersed” into 
simple ones. We now know that this is not so, and that in the function space 
of vector fields there are whole regions consisting of fields with more com- 
plicated behavior of phase curves. 

The conclusions which follow from this are relevant to a wide range of 
phenomena, in which “stochastic” behavior of deterministic objects is 
observed. 

Namely, suppose that in the phase space of some (non-conservative) 
system there is an attracting invariant manifold (or set) in which the phase 
curves have the property of exponential instability. We now know that 
systems with such a property are not exceptional: under small changes of the 
system this property must persist. What is seen by an experimenter observing 
motions of such a system? 

The approach of phase curves to an attracting set will be interpreted as 
the establishment of some sort of limiting conditions. The further motion of a
phase point near the attracting set will involve chaotic, unpredictable changes 
of “phase” of the limiting behavior, perceptible as “stochasticity” or 
“turbulence.” 

Unfortunately, no convincing analysis from this point of view has yet 
been developed for physical examples of a turbulent character. A primary 
example is the hydrodynamic instability of a viscous fluid, described by the 
so-called Navier-Stokes equations. The phase space of this problem is 
infinite-dimensional (it is the space of vector fields with divergence 0 in the 
domain of fluid flow), but the infinite-dimensionality of the problem is 
apparently not a serious obstacle, since the viscosity extinguishes the high 
harmonics (small vortices) faster and faster as the harmonics are higher and 
higher. As a result, the phase curves from the infinite-dimensional space 
seem to approach some finite-dimensional manifold (or set), to which the 
limit regime also belongs.

For large viscosity, we have a stable attracting equilibrium position in the
phase space (“stable stationary flow”). As the viscosity decreases it loses stability;
for example, a stable limit cycle can appear in phase space (“periodic
flow”) or a stable equilibrium position of a new type (“secondary stationary
flow”).⁹⁶ As the viscosity decreases further, more and more harmonics come
into play, and the limit regime can become ever higher in dimension. 

For small viscosity, the approach to a limit regime with exponentially 
unstable trajectories seems very likely. Unfortunately, the corresponding 
calculations have not yet been carried out due to the limited capacity of 
existing computers. However, the following general conclusion can be drawn 
without any calculations: turbulent phenomena may appear even if solutions 
exist and are unique; exponential instability, which is encountered even in 
deterministic systems with a finite number of degrees of freedom, is sufficient. 

As one more example of an application of exponential instability we men- 
tion the proof announced by Ya. G. Sinai of the “ergodic hypothesis” of 
Boltzmann for systems of rigid balls. The hypothesis is that the phase flow 
corresponding to the motion of identical absolutely elastic balls in a box with 
elastic walls is ergodic on connected energy level sets. (Ergodicity means that 
almost every phase curve spends an amount of time in every measurable 
piece of the level set proportional to the measure of that piece.) 

Boltzmann’s hypothesis allows us to replace time averages by space 
averages, and was for a long time considered to be necessary to justify 
statistical mechanics. In reality, Boltzmann’s hypothesis (in which it is a 
question of a limit as time approaches infinity) is not necessary for passing 
to the statistical limit (the number of pieces approaches infinity). However, 
Boltzmann’s hypothesis inspired the entire analysis of the stochastic proper- 
ties of dynamical systems (so-called ergodic theory), and its proof serves as a 
measure of the maturity of this theory. 

The exponential instability of trajectories in Boltzmann’s problem arises 
as a result of collisions of the balls with one another, and can be explained 
in the following way. For simplicity, we will consider a system of only two 
particles in the plane, and will represent a square box with reflection off the 
walls by the planar torus {(x, y) mod 1}. Then we can consider one of the particles
as stationary (using the conservation of momentum); the other particle
can be considered as a point.

In this way we arrive at the model problem of motion of a point on a toral
billiard table with a circular wall in the middle from which the point is re- 
flected according to the law “the angle of incidence is equal to the angle of 
reflection” (Figure 235). 

Figure 235 Torus-shaped billiard table with scattering by a circular wall

To investigate this system we look at an analogous billiard table bounded
on the outside by a planar convex curve (e.g., the motion of a point inside an
ellipse). Motion on such a billiard table can be considered as the limiting
case of the geodesic flow on the surface of an ellipsoid. Passage to the limit
consists of decreasing the smallest axis of the ellipsoid to zero. As a result,
geodesics on the ellipsoid become billiard trajectories on the ellipse. We
discover from this that the ellipse can reasonably be thought of as two-sided
and that, under every reflection, the geodesic goes from one side of the ellipse
to the other.

96 A more detailed account of loss of stability is given in “Lectures on bifurcations and versal
families,” Russian Math. Surveys 27, no. 5 (1972), 55-123.

We now return to our toral billiard table. Motion on it can be looked at as 
the limiting case of the geodesic flow on a smooth surface. This surface is 
obtained from looking at the torus with a hole as a two-sided surface, giving 
it some thickness and slightly smoothing the sharp edge. As a result we have a 
surface with the topology of a pretzel (a sphere with two handles). 

After blowing up the ellipse into the ellipsoid we obtain a surface of 
positive curvature; after blowing up the torus with a hole we get a surface of 
negative curvature (in both cases the curvature is concentrated close to the 
edge, but the blowing up can be done so that the sign of the curvature does 
not change). Thus motion in our toral billiard table can be looked at as the 
limiting case of motion along geodesics on a surface of negative curvature. 

Now, to prove Boltzmann’s conjecture (in the simple case under con- 
sideration) it is sufficient to verify that the analysis of stochastic properties 
of geodesic flows on surfaces of negative curvature holds in the indicated 
limiting case. 

A more detailed presentation of the proof turns out to be very complicated;
it has been published only for the case of systems of two particles (Ya. G.
Sinai, Dynamical systems with elastic reflections, Russian Mathematical
Surveys, 25, no. 2 (1970), 137-189).


and the hydrodynamics of ideal fluids


Eulerian motion of a rigid body can be described as motion along geodesics 
in the group of rotations of three-dimensional euclidean space provided with 
a left-invariant riemannian metric. A significant part of Euler’s theory 
depends only upon this invariance, and therefore can be extended to other 
groups. 

Among the examples involving such a generalized Euler theory are motion 
of a rigid body in a high-dimensional space and, especially interesting, the 
hydrodynamics of an ideal (incompressible and inviscid) fluid. In the 
latter case, the relevant group is the group of volume-preserving diffeo- 
morphisms of the domain of fluid flow. In this example, the principle of least 
action implies that the motion of the fluid is described by the geodesics in the 
metric given by the kinetic energy. (If we wish, we can take this principle to be 
the mathematical definition of an ideal fluid.) It is easy to verify that this 
metric is (right) invariant. 

Of course, extending results obtained for finite-dimensional Lie groups 
to the infinite-dimensional case should be done with care. For example, in 
three-dimensional hydrodynamics an existence and uniqueness theorem for 
solutions of the equations of motion has not yet been proved. Nevertheless, 
it is interesting to see what conclusions can be drawn by formally carrying 
over properties of geodesics on finite-dimensional Lie groups to the infinite- 
dimensional case. These conclusions take the character of a priori statements 
(identities, inequalities, etc.) which should be satisfied by all reasonable 
solutions. In some cases, the formal conclusions can then be rigorously 
justified directly, without infinite-dimensional analysis. 

For example, the Euler equations of motion for a rigid body have as their 
analogue in hydrodynamics the Euler equations of motion of an ideal fluid. 
Euler’s theorem on the stability of rotations around the large and small axes 
of the inertia ellipsoid corresponds in hydrodynamics to a slight generaliza- 
tion of Rayleigh’s theorem on the stability of flows without inflection points 
of the velocity profile. 

It is also easy to extract from Euler’s formulas an explicit expression for 
the riemannian curvature of a group with a one-sided invariant metric. 
Applying this to hydrodynamics we find the curvature of the group of dif- 
feomorphisms preserving the volume element. It is interesting to note that in 
sufficiently nice two-dimensional directions, the curvature turns out to be 
finite and, in many cases, negative. Negative curvature implies exponential 
instability of geodesics (cf. Appendix 1). In the case under consideration, the 
geodesics are motions of an ideal fluid; therefore the calculation of the 
curvature of the group of diffeomorphisms gives us some information on the 
instability of ideal fluid flow. In fact, the curvature determines the character- 
istic path length on which differences between initial conditions grow by e. 
Negative curvature leads to practical indeterminacy of the flow: on a path 
only a few times longer than the characteristic path length, a deviation in 
initial conditions grows 100 times larger.

In this appendix, we will briefly set out the results of calculations related
to geodesics on groups with one-sided (right- or left-) invariant metrics.
Proofs and further details can be found in the following places:


V. Arnold, Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications
à l’hydrodynamique des fluides parfaits. Annales de l’Institut Fourier, XVI, no. 1
(1966), 319-361.


V. I. Arnold, An a priori estimate in the theory of hydrodynamic stability, Izv. Vyssh. Uchebn. 
Zaved. Matematika 1966, no. 5 (54), 3-5. (Russian) 


V. I. Arnold, The Hamiltonian nature of the Euler equations in the dynamics of a rigid body and
of an ideal fluid, Uspekhi Matematicheskikh Nauk, 24 (1969), no. 3 (147) 225-226. 
(Russian) 


L. A. Dikii, A remark on Hamiltonian systems connected with the rotation group, Functional 
Analysis and Its Applications, 6:4 (1972) 326-327. 


D. G. Ebin, J. Marsden, Groups of diffeomorphisms and the motion of an incompressible fluid,
Annals of Math. 92, no. 1 (1970), 102-163. 


O. A. Ladyzhenskaya, On the local solvability of non-stationary problems for incompressible 
ideal and viscous fluids and vanishing viscosity, Boundary problems in mathematical 
physics, v. 5 (Zapiski nauchnikh seminarov LOMI, v. 21), “Nauka,” 1971, 65-78. (Russian)


A. S. Mishchenko, Integrals of geodesic flows on Lie groups, Functional Analysis and Its Ap- 
plications, 4, no. 3 (1970), 232-235. 


A. M. Obukhov, On integral invariants in systems of hydrodynamic type, Doklady Acad. Nauk. 
184, no. 2 (1969). (Russian) 


L. D. Faddeev, Towards a stability theory of stationary planar-parallel flows of an ideal fluid, 
Boundary problems in mathematical physics, v. 5 (Zapiski nauchnikh seminarov LOMI, 
v. 21), “Nauka,” 1971, 164-172. (Russian) 


A Notation: The adjoint and co-adjoint representations 


Let G be a real Lie group and 𝔤 its Lie algebra, i.e., the tangent space to the
group at the identity provided with the commutator bracket operation
[ , ].

A Lie group acts on itself by left and right translation: every element g
of the group G defines diffeomorphisms of the group onto itself:

$$L_g: G \to G, \quad L_g h = gh; \qquad R_g: G \to G, \quad R_g h = hg.$$

The induced maps of the tangent spaces will be denoted by

$$L_{g*}: TG_h \to TG_{gh} \quad \text{and} \quad R_{g*}: TG_h \to TG_{hg}$$

for every h in G.

The diffeomorphism $R_{g^{-1}} L_g$ is an inner automorphism of the group. It
leaves the group identity element fixed. Its derivative at the identity is a
linear map from the algebra (i.e., the tangent space to the group at the
identity) to itself. This map is denoted by

$$\mathrm{Ad}_g: \mathfrak{g} \to \mathfrak{g}, \qquad \mathrm{Ad}_g = (R_{g^{-1}} L_g)_{*e},$$

and is called the adjoint representation of the group. It is easy to verify that
$\mathrm{Ad}_g$ is an algebra homomorphism, i.e., that

$$\mathrm{Ad}_g[\xi, \eta] = [\mathrm{Ad}_g \xi, \mathrm{Ad}_g \eta], \qquad \xi, \eta \in \mathfrak{g}.$$

It is also clear that $\mathrm{Ad}_{gh} = \mathrm{Ad}_g \mathrm{Ad}_h$.
We can consider Ad as a map of the group into the space of linear operators
on the algebra:

$$\mathrm{Ad}(g) = \mathrm{Ad}_g.$$


The map Ad is differentiable. Its derivative at the identity of the group is a
linear map from the algebra 𝔤 to the space of linear operators on 𝔤. This
map is denoted by ad, and its image on an element ξ in the algebra by $\mathrm{ad}_\xi$.
Thus $\mathrm{ad}_\xi$ is an endomorphism of the algebra space, and we have

$$\mathrm{ad} = \mathrm{Ad}_{*e}: \mathfrak{g} \to \operatorname{End} \mathfrak{g}, \qquad \mathrm{ad}_\xi = \left.\frac{d}{dt}\right|_{t=0} \mathrm{Ad}_{e^{t\xi}},$$

where $e^{t\xi}$ is the one-parameter group with tangent vector ξ. From the formula
written above it is easy to deduce an expression for ad in terms of the algebra
alone:

$$\mathrm{ad}_\xi \eta = [\xi, \eta].$$
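
For a matrix group these formulas can be checked numerically. The sketch
below assumes G = SO(3), realized by 3 × 3 rotation matrices, so that
$\mathrm{Ad}_g \xi = g \xi g^{-1}$ and the bracket is the matrix commutator; the helper names
(`hat`, `exp_so3`, `Ad`, and so on) and the sample vectors are invented for
this illustration.

```python
import math

def hat(w):
    """Skew-symmetric matrix of w: the element of so(3) with angular velocity w."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def lincomb(ca, a, cb, b):
    return [[ca * a[i][j] + cb * b[i][j] for j in range(3)] for i in range(3)]

def bracket(a, b):
    return lincomb(1.0, matmul(a, b), -1.0, matmul(b, a))

def exp_so3(w, t):
    """Rotation exp(t * hat(w)), computed with Rodrigues' formula."""
    theta = t * math.sqrt(sum(c * c for c in w))
    k = [[t * entry for entry in row] for row in hat(w)]
    identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if abs(theta) < 1e-12:
        return lincomb(1.0, identity, 1.0, k)
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / theta**2
    return lincomb(1.0, lincomb(1.0, identity, a, k), b, matmul(k, k))

def Ad(g, xi):
    # For rotation matrices the inverse is the transpose: Ad_g xi = g xi g^T.
    return matmul(matmul(g, xi), transpose(g))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

xi, eta = (0.3, -1.2, 0.5), (1.0, 0.4, -0.7)   # arbitrary sample elements
X, Y = hat(xi), hat(eta)

# ad_xi eta as the derivative of Ad_{exp(t xi)} eta at t = 0 (central difference).
h = 1e-4
derivative = lincomb(0.5 / h, Ad(exp_so3(xi, h), Y), -0.5 / h, Ad(exp_so3(xi, -h), Y))
err_ad = max_abs_diff(derivative, bracket(X, Y))

# Ad_g is an algebra homomorphism: Ad_g [xi, eta] = [Ad_g xi, Ad_g eta].
g = exp_so3((0.2, 0.9, -0.4), 0.7)
err_hom = max_abs_diff(Ad(g, bracket(X, Y)), bracket(Ad(g, X), Ad(g, Y)))

print("error in ad_xi eta = [xi, eta]:", err_ad)
print("error in homomorphism property:", err_hom)
```

Both errors are at the level of the finite-difference and rounding accuracy.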


We now consider the dual vector space 𝔤* to the Lie algebra 𝔤. This is
the space of real linear functionals on the Lie algebra. In other words, 𝔤*
is the cotangent space to the group at the identity, $\mathfrak{g}^* = T^*G_e$. The value
of an element ξ of the cotangent space to the group at some point g on an
element η of the tangent space at the same point will be denoted by round
brackets:

$$(\xi, \eta) \in \mathbb{R}, \qquad \xi \in T^*G_g,\ \eta \in TG_g.$$

Left and right translation induce operators on the cotangent space dual
to $L_{g*}$ and $R_{g*}$. We denote them by

$$L_g^*: T^*G_{gh} \to T^*G_h \quad \text{and} \quad R_g^*: T^*G_{hg} \to T^*G_h$$

for every h in G. These operators are defined by the identities

$$(L_g^* \xi, \eta) = (\xi, L_{g*} \eta) \quad \text{and} \quad (R_g^* \xi, \eta) = (\xi, R_{g*} \eta).$$

The transpose operators $\mathrm{Ad}_g^*$, where g runs through the Lie group G, form
a representation of this group, i.e., they satisfy the relations

$$\mathrm{Ad}_{gh}^* = \mathrm{Ad}_h^* \, \mathrm{Ad}_g^*.$$

This representation is called the co-adjoint representation of the group and
plays an important role in all questions related to (left) invariant metrics on
the group.

Consider the derivative of the operator $\mathrm{Ad}_g^*$ with respect to g at the identity.
This derivative is a linear map from the algebra to the space of linear operators
on the dual space to the algebra. This linear map is denoted by ad*, and its
image on an element ξ in the algebra is denoted by $\mathrm{ad}_\xi^*$. Thus $\mathrm{ad}_\xi^*$ is a linear
operator on the dual space to the algebra,

$$\mathrm{ad}_\xi^*: \mathfrak{g}^* \to \mathfrak{g}^*.$$

It is easy to see that $\mathrm{ad}_\xi^*$ is the adjoint of $\mathrm{ad}_\xi$:

$$(\mathrm{ad}_\xi^* \eta, \zeta) = (\eta, \mathrm{ad}_\xi \zeta) \quad \text{for all } \eta \in \mathfrak{g}^*,\ \zeta \in \mathfrak{g}.$$

It is sometimes convenient to denote the action of ad* by braces:

$$\mathrm{ad}_\xi^* \eta = \{\xi, \eta\}, \quad \text{where } \xi \in \mathfrak{g},\ \eta \in \mathfrak{g}^*.$$

Thus braces mean the bilinear function from 𝔤 × 𝔤* to 𝔤*, related to commutation
in the algebra by the identity

$$(\{\xi, \eta\}, \zeta) = (\eta, [\xi, \zeta]).$$
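
To make the braces concrete, here is a small check for the rotation algebra.
It is a sketch under assumptions: the algebra is taken to be three-dimensional
space with the vector product as commutator, and the dual space is identified
with the same space through the euclidean dot product. Under these
identifications the defining identity gives {ξ, η} = η × ξ; the function names
and sample vectors are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def braces(xi, eta):
    # Co-adjoint action ad*_xi eta for so(3), with the dual space identified
    # with R^3 by the dot product (an assumption of this sketch).
    return cross(eta, xi)

xi = (0.3, -1.2, 0.5)      # element of the algebra
eta = (1.0, 0.4, -0.7)     # element of the dual space
zeta = (-0.6, 0.8, 2.0)    # element of the algebra

lhs = dot(braces(xi, eta), zeta)     # ({xi, eta}, zeta)
rhs = dot(eta, cross(xi, zeta))      # (eta, [xi, zeta])
print(lhs, rhs)
```

The two numbers agree because both are the same scalar triple product.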


We consider now the orbits of the co-adjoint representation of the group 
in the dual space of the algebra. At each point of an orbit we have a natural 
symplectic structure (called the Kirillov form since A. A. Kirillov first used it 
to investigate representations of nilpotent Lie groups). Thus, the orbits of 
the co-adjoint representation are always even-dimensional. We also note 
that we obtain a series of examples of symplectic manifolds by looking at 
different Lie groups and all possible orbits. 

The symplectic structure on the orbits of the co-adjoint representation is
defined by the following construction. Let x be a point in the dual space to
the algebra and ξ a vector tangent at this point to its orbit. Since 𝔤* is a
vector space, we can consider the vector ξ, which really belongs to the tangent
space to 𝔤* at x, as lying in 𝔤*.

The vector ξ can be represented (in many ways) as the velocity vector of
the motion of the point x under the co-adjoint action of the one-parameter
group $e^{at}$ with velocity vector a ∈ 𝔤. In other words, every vector tangent to
the orbit of x in the co-adjoint representation of the group can be expressed
in terms of a suitable vector a in the algebra by the formula

$$\xi = \{a, x\}, \qquad a \in \mathfrak{g},\ x \in \mathfrak{g}^*.$$

Now we are ready to define the value of the symplectic 2-form Ω on a pair
of vectors $\xi_1$, $\xi_2$ tangent to the orbit of x. Namely, we express $\xi_1$ and $\xi_2$ in
terms of algebra elements $a_1$ and $a_2$ by the formula above, and then obtain
the scalar

$$\Omega(\xi_1, \xi_2) = (x, [a_1, a_2]), \qquad x \in \mathfrak{g}^*,\ a_i \in \mathfrak{g}.$$

It is easy to verify that (1) the bilinear form Ω is well defined, i.e., its value does
not depend on the choice of $a_i$; (2) Ω is skew-symmetric and therefore gives
a differential 2-form Ω on the orbit; and (3) Ω is nondegenerate and closed
(the proofs can be found, for instance, in Appendix 5). Thus the form Ω is a
symplectic structure on an orbit of the co-adjoint representation.
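
Properties (1) and (2) are easy to test numerically in the simplest case. The
sketch below again assumes the rotation algebra, realized as three-dimensional
space with the vector product, and the dual space identified with it by the dot
product, so that {a, x} = x × a and the orbits are spheres centered at the
origin. All names and sample vectors are chosen for the illustration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def tangent_vector(a, x):
    # xi = {a, x}; for so(3) with the identifications above this is x × a.
    return cross(x, a)

def kirillov_form(x, a1, a2):
    # Omega(xi_1, xi_2) = (x, [a_1, a_2]) with [a_1, a_2] = a_1 × a_2.
    return dot(x, cross(a1, a2))

x = (0.5, -1.0, 2.0)            # point of the dual space
a1 = (1.0, 0.3, -0.2)
a2 = (-0.4, 0.9, 0.7)

# A different algebra element giving the same tangent vector xi_1:
a1_alt = tuple(p + 3.5 * q for p, q in zip(a1, x))

xi1 = tangent_vector(a1, x)
xi1_alt = tangent_vector(a1_alt, x)
omega = kirillov_form(x, a1, a2)
omega_alt = kirillov_form(x, a1_alt, a2)

print("xi_1 from a1:    ", xi1)
print("xi_1 from a1_alt:", xi1_alt)
print("Omega values:    ", omega, omega_alt)
print("skew-symmetry:   ", kirillov_form(x, a2, a1), "=", -omega)
print("xi_1 is tangent to the sphere |x| = const:", dot(xi1, x))
```

The tangent vectors are orthogonal to x, so the orbit through x is a
two-dimensional sphere, in agreement with the even-dimensionality of orbits.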


B Left-invariant metrics


A riemannian metric on a Lie group G is called left-invariant if it is preserved
by all left translations $L_g$, i.e., if the derivative of left translation carries every
vector to a vector of the same length.

It is sufficient to give a left-invariant metric at one point of the group, for 
instance the identity; then the metric can be carried to the remaining points 
by left translations. Thus there are as many left-invariant riemannian metrics 
on a group as there are euclidean structures on the algebra. 

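A euclidean structure on the algebra is defined by a symmetric positive-definite
operator from the algebra to its dual space. Thus, let A: 𝔤 → 𝔤* be
a symmetric positive linear operator:

$$(A\xi, \eta) = (A\eta, \xi) \quad \text{for all } \xi, \eta \text{ in } \mathfrak{g}.$$

(It is not very important that A be positive, but in mechanical applications
the quadratic form (Aξ, ξ) is positive-definite.)
We define a symmetric operator $A_g: TG_g \to T^*G_g$ by left translation:

$$A_g \xi = L_{g^{-1}}^* A L_{g^{-1}*} \xi.$$

We thus obtain the following commutative diagram of linear operators:

$$
\begin{array}{ccccc}
\mathfrak{g} & \xleftarrow{\;L_{g^{-1}*}\;} & TG_g & \xrightarrow{\;R_{g^{-1}*}\;} & \mathfrak{g} \\
\Big\downarrow{\scriptstyle A} & & \Big\downarrow{\scriptstyle A_g} & & \\
\mathfrak{g}^* & \xleftarrow{\;L_g^*\;} & T^*G_g & \xrightarrow{\;R_g^*\;} & \mathfrak{g}^*
\end{array}
$$

in which the outer arrow $\mathrm{Ad}_g$ carries the left copy of 𝔤 to the right copy, and
the outer arrow $\mathrm{Ad}_g^*$ carries the right copy of 𝔤* to the left copy.

We will denote by angled brackets the scalar product determined by the
operator $A_g$:

$$\langle \xi, \eta \rangle_g = (A_g \xi, \eta) = (A_g \eta, \xi) = \langle \eta, \xi \rangle_g.$$

This scalar product gives a riemannian metric on the group G, invariant under
left translations. The scalar product in the algebra will be denoted simply by
⟨ , ⟩. We define an operation B: 𝔤 × 𝔤 → 𝔤 by the identity

$$\langle [a, b], c \rangle = \langle B(c, a), b \rangle \quad \text{for all } b \text{ in } \mathfrak{g}.$$

Clearly, this operation B is bilinear, and for fixed first argument is skew-symmetric
in the second:

$$\langle B(c, a), b \rangle + \langle B(c, b), a \rangle = 0.$$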
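
The operation B can be written out explicitly in a simple case. The sketch
below assumes the rotation algebra (three-dimensional space with the vector
product as commutator) and a diagonal operator A with hypothetical entries
(1, 2, 3); under these assumptions the defining identity gives
B(c, a) = A⁻¹((Ac) × a). The function names are invented for the example.

```python
INERTIA = (1.0, 2.0, 3.0)   # hypothetical diagonal entries of the operator A

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inner(a, b):
    # Scalar product <a, b> = (A a, b) for the diagonal operator A.
    return sum(m * p * q for m, p, q in zip(INERTIA, a, b))

def B(c, a):
    # Solve <[a, b], c> = <B(c, a), b> for all b: B(c, a) = A^{-1}((A c) × a).
    Ac = tuple(m * p for m, p in zip(INERTIA, c))
    w = cross(Ac, a)
    return tuple(p / m for p, m in zip(w, INERTIA))

a = (0.3, -1.2, 0.5)
b = (1.0, 0.4, -0.7)
c = (-0.6, 0.8, 2.0)

defining_lhs = inner(cross(a, b), c)          # <[a, b], c>
defining_rhs = inner(B(c, a), b)              # <B(c, a), b>
skew_sum = inner(B(c, a), b) + inner(B(c, b), a)

print(defining_lhs, defining_rhs)
print("skew-symmetry defect:", skew_sum)
```

The first two numbers coincide, and the skew-symmetry defect vanishes up to
rounding, as the identity above requires.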


C Example


Let G = SO(3) be the group of rotations of three-dimensional euclidean 
space, i.e. the configuration space of a rigid body fixed at a point. A motion 
of the body is then described by a curve g = g(t) in the group. The Lie algebra 
of G is the three-dimensional space of angular velocities of all possible 
rotations. The commutator in this algebra is the usual vector product. 

A rotation velocity ġ of the body is a tangent vector to the group at the
point g. To get the angular velocity, we must carry this vector to the tangent
space of the group at the identity, i.e. to the algebra. But this can be done in
two ways: by left and right translation. As a result, we obtain two different
vectors in the algebra:

$$\omega_c = L_{g^{-1}*}\,\dot{g} \in \mathfrak{g} \quad \text{and} \quad \omega_s = R_{g^{-1}*}\,\dot{g} \in \mathfrak{g}.$$


These two vectors are none other than the “angular velocity in the body” and 
the “angular velocity in space.” 


An element $g$ of the group $G$ corresponds to a position of the body obtained by the motion $g$ 
from some initial state (corresponding to the identity element of the group and chosen arbitrarily). 
Let $\omega$ be an element of the algebra. 

Let $e^{\omega t}$ be a one-parameter group of rotations with angular velocity $\omega$; $\omega$ is the tangent 
vector to this one-parameter group at the identity. Now we look at the displacement 

$$e^{\omega\tau} g, \quad \text{where } g = g(t) \in G,\ \omega \in \mathfrak{g},\ \text{and } \tau \ll 1,$$

obtained from the displacement $g$ by a rotation with angular velocity $\omega$ after a small time $\tau$. 
If the vector $\dot g$ coincides with the vector 

$$\left.\frac{d}{d\tau}\right|_{\tau=0} e^{\omega\tau} g,$$

then $\omega$ is called the angular velocity relative to space and is denoted by $\omega_s$. Thus $\omega_s$ is obtained 
from $\dot g$ by right translation. In an analogous way we can show that the angular velocity in 
the body is the left translate of the vector $\dot g$ in the algebra. 
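The two translations can be tried out numerically for G = SO(3), where left and right translation to the identity are multiplication of the tangent vector by the inverse rotation on the left and on the right. The following is a minimal sketch, not part of the original text: the curve `g(t)`, the helper names, and the finite-difference step are all assumptions chosen for illustration.

```python
import math

def hat(w):
    """Skew matrix of w: hat(w) applied to x is the vector product of w and x."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def vee(S):
    """Inverse of hat: read the vector off a skew-symmetric matrix."""
    return [S[2][1], S[0][2], S[1][0]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(P):
    return [[P[j][i] for j in range(3)] for i in range(3)]

def matvec(P, v):
    return [sum(P[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot(axis, angle):
    """Rotation by `angle` about the unit vector `axis` (Rodrigues' formula)."""
    K = hat(axis)
    K2 = matmul(K, K)
    s, c = math.sin(angle), 1.0 - math.cos(angle)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + c * K2[i][j]
             for j in range(3)] for i in range(3)]

def g(t):
    """A hypothetical motion of the body: a curve in SO(3)."""
    return matmul(rot([0.0, 0.0, 1.0], 0.7 * t), rot([1.0, 0.0, 0.0], 1.3 * t))

def velocities(t, h=1e-5):
    """Return g(t) and the angular velocities in the body and in space."""
    gt, gp, gm = g(t), g(t + h), g(t - h)
    g_dot = [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
    g_inv = transpose(gt)                  # the inverse of a rotation is its transpose
    omega_c = vee(matmul(g_inv, g_dot))    # left translation of g_dot to the identity
    omega_s = vee(matmul(g_dot, g_inv))    # right translation of g_dot to the identity
    return gt, omega_c, omega_s

gt, omega_c, omega_s = velocities(0.4)
print("angular velocity in the body:", omega_c)
print("angular velocity in space:   ", omega_s)
print("g applied to omega_c:        ", matvec(gt, omega_c))
```

In this sketch the last two printed vectors agree up to discretization error: the angular velocity in space is the rotation g applied to the angular velocity in the body.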


The dual space g* to the algebra in our example is the space of angular 
momenta. 

The kinetic energy of a body is determined by the vector of angular velocity 
in the body and does not depend on the position of the body in space. There- 
fore, kinetic energy gives a left-invariant riemannian metric on the group. 
The symmetric positive-definite operator $A_g: TG_g \to T^*G_g$ given by this 
metric is called the moment of inertia operator (or tensor). It is related to the 
kinetic energy by the formula 
$T = \tfrac12\langle \dot g, \dot g \rangle_g = \tfrac12\langle \omega_c, \omega_c \rangle = \tfrac12(A\omega_c, \omega_c) = \tfrac12(A_g\dot g, \dot g)$, 
where $A: \mathfrak{g} \to \mathfrak{g}^*$ is the value of $A_g$ for $g = e$. The image of the 
vector $\dot g$ under the action of the moment of inertia operator $A_g$ is called the 
angular momentum and is denoted by $M = A_g\dot g$. The vector $M$ lies in the 
cotangent space to the group at the point $g$, and it can be carried to the 
cotangent space to the group at the identity by both left and right translations. 
We obtain two vectors 

$$M_c = L_g^* M \in \mathfrak{g}^* \quad \text{and} \quad M_s = R_g^* M \in \mathfrak{g}^*.$$


These vectors in the dual space to the algebra are none other than the 
angular momentum relative to the body ($M_c$) and the angular momentum 
relative to space ($M_s$). This follows easily from the expression for kinetic 
energy in terms of momentum and angular velocity: 

$$T = \tfrac12(M_c, \omega_c) = \tfrac12(M, \dot g).$$


By the principle of least action, the motion of a rigid body under inertia 
(with no external forces) is a geodesic in the group of rotations with the left- 
invariant metric described above. 

We will now look at a geodesic of an arbitrary left-invariant riemannian 
metric on an arbitrary Lie group as a motion of a “generalized rigid body” 
with configuration space $G$. Such a “rigid body with group $G$” is determined 
by its kinetic energy, i.e., a positive-definite quadratic form on the Lie algebra. 
More precisely, we will consider geodesics of a left-invariant metric on a 
group $G$ given by a quadratic form $\langle \omega, \omega \rangle$ on the algebra as motions of a 
rigid body with group $G$ and kinetic energy $\langle \omega, \omega \rangle/2$. 

To every motion $t \to g(t)$ of our generalized rigid body we can associate 
four curves: 

$$t \to \omega_c(t) \in \mathfrak{g}, \quad t \to \omega_s(t) \in \mathfrak{g}, \quad t \to M_c(t) \in \mathfrak{g}^*, \quad t \to M_s(t) \in \mathfrak{g}^*,$$


called motions of the vectors of angular velocity and momentum in the body 
and in space. The differential equations which these curves satisfy were found 
by Euler for an ordinary rigid body. However, they are true in the most general 
case of an arbitrary group G, and we will call them the Euler equations for a 
generalized rigid body. 


Remark. In the ordinary theory of a rigid body six different three-dimensional 
spaces $\mathbb{R}^3$, $\mathbb{R}^{3*}$, $\mathfrak{g}$, $\mathfrak{g}^*$, $TG_g$, and $T^*G_g$ are identified. The fact that the 
dimensions of the space $\mathbb{R}^3$ in which the body moves and of the Lie algebra $\mathfrak{g}$ 
of its group of motions are the same is an accident related to the dimension 3; 
in the $n$-dimensional case, $\mathfrak{g}$ has dimension $n(n - 1)/2$. 

The identification of the Lie algebra g with its dual space g* has a more 
profound basis. The fact is that on the group of rotations there exists (and is 
unique up to multiplication) a two-sided invariant riemannian metric. This 
metric gives once and for all a preferred isomorphism of the vector spaces g 
and g* (and also of $TG_g$ and $T^*G_g$). It allows us therefore to consider the 
vectors of angular velocity and momentum as lying in the same euclidean 
space. With this identification, the operation { , } is simply the commutator 
of the algebra, taken with a minus sign. 
A two-sided invariant metric exists on any compact Lie group. Therefore, 
to study motions of rigid bodies with compact groups we may identify the 
spaces of angular velocities and momenta. However, we cannot make this 
identification for applications to non-compact (or infinite-dimensional) 
groups of diffeomorphisms. 


D Euler’s equation 


The results of Euler (obtained by him in the particular case G = SO(3)) can 
be formulated as the following theorems on the motion of the vectors of 
angular velocity and momentum of a generalized rigid body with group G. 


Theorem 1. The vector of angular momentum relative to space is preserved 
under motion: 

$$\frac{dM_s}{dt} = 0.$$

Theorem 2. The vector of angular momentum relative to the body satisfies 
Euler’s equation 

$$\frac{dM_c}{dt} = \{\omega_c, M_c\}.$$


These theorems are proved for a generalized rigid body in the same way as 
for an ordinary rigid body. 
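For the ordinary rigid body both theorems can be observed numerically. The sketch below is an illustration under stated assumptions, not the author's construction: it identifies the algebra and its dual with three-dimensional euclidean space as in the Remark of subsection C (so that Euler's equation reads "derivative of the body momentum equals body momentum cross body angular velocity"), uses hypothetical principal moments `INERTIA`, and integrates with a classical Runge-Kutta step.

```python
def cross(a, b):
    """Vector product in R^3, i.e. the commutator of the algebra so(3)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

INERTIA = [1.0, 2.0, 3.0]  # hypothetical principal moments of inertia

def rhs(state):
    """state = 9 entries of g (row-major) followed by the 3 entries of M_c."""
    g, M = state[:9], state[9:]
    w = [M[i] / INERTIA[i] for i in range(3)]       # omega_c = A^{-1} M_c
    W = [0.0, -w[2], w[1],
         w[2], 0.0, -w[0],
         -w[1], w[0], 0.0]                           # skew matrix of omega_c
    g_dot = [sum(g[3 * i + k] * W[3 * k + j] for k in range(3))
             for i in range(3) for j in range(3)]   # g' = g * hat(omega_c)
    M_dot = cross(M, w)                              # Euler's equation in the body
    return g_dot + M_dot

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return [s + c * ki for s, ki in zip(state, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def space_momentum(state):
    """M_s = g M_c under the identification of the algebra with its dual."""
    g, M = state[:9], state[9:]
    return [sum(g[3 * i + k] * M[k] for k in range(3)) for i in range(3)]

state = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0] + [0.3, 1.0, 0.2]
start = space_momentum(state)
for _ in range(2000):
    state = rk4_step(state, 0.005)
end = space_momentum(state)
print("M_s at the start:", start)
print("M_s at the end:  ", end)
print("M_c at the end:  ", state[9:])
```

In this sketch the momentum relative to space stays fixed up to integration error, while the momentum relative to the body moves according to Euler's equation.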

Remark 1. The vector of angular velocity in the body, $\omega_c$, can be expressed 
linearly in terms of the vector of angular momentum in the body, $M_c$, by 
using the inverse of the inertia operator: $\omega_c = A^{-1}M_c$. Therefore, Euler’s 
equation can be considered as an equation for the vector of angular 
momentum in the body alone; its right-hand side is quadratic in $M_c$. 


We can also express this result in the following way. Consider the phase 
flow of our rigid body. (Its phase space T*G has dimension twice the dimen- 
sion n of the group G or the space of angular momenta g*.) Then this phase 
flow in a 2n-dimensional manifold factors over the flow given by Euler’s 
equation in the n-dimensional vector space g*. 


A factorization of a phase flow $g^t$ on a manifold $X$ over a phase flow $f^t$ on a manifold $Y$ 
is a smooth mapping $\pi$ of $X$ onto $Y$ under which motions $g^t$ are mapped to motions $f^t$, so that 
the following diagram commutes (i.e., $\pi g^t = f^t \pi$): 

$$
\begin{array}{ccc}
X & \xrightarrow{\;g^t\;} & X \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi} \\
Y & \xrightarrow{\;f^t\;} & Y
\end{array}
$$

In our case, $X = T^*G$ is the phase space of the body, $Y = \mathfrak{g}^*$ is the space of angular momenta. 
The projection $\pi: T^*G \to \mathfrak{g}^*$ is defined by left translation ($\pi M = L_g^* M$ for $M \in T^*G_g$), $g^t$ is 
the phase flow of the body under consideration on the $2n$-dimensional space $T^*G$, and $f^t$ is the 
phase flow of the Euler equation in the $n$-dimensional space of angular momenta $\mathfrak{g}^*$. 


In other words, a motion of the vector of angular momentum relative to 
the body depends only on the initial position of the vector of angular mo- 
mentum relative to the body and does not depend on the position of the 
body in the space. 

Remark 2. The law of conservation of the vector of angular momentum 
relative to space can be expressed by saying that every component of this 
vector in some coordinate system on the space g* is conserved. We thus 
obtain a set of first integrals of the equations of motion of the rigid body. In 
particular, to every element of the Lie algebra g there corresponds a linear 
function on the space g* and, therefore, a first integral. The Poisson brackets 
of first integrals given by functions on g* are themselves functions on g*, as 
can be seen easily. We thus obtain an (infinite-dimensional) extension of the 
Lie algebra g, consisting of all functions on g*. g itself is included in this 
extension as the Lie algebra of linear functions on g*. Of course, of all these 
first integrals of the phase flow in a 2n-dimensional space only n are func- 
tionally independent. As the n independent integrals we can take, for example, 
n linear functions on g* which form a basis in g. 


Because of possible infinite-dimensional applications, we would like to 
avoid coordinates and formulate statements about first integrals intrinsically. 
This can be done by reformulating Theorem 1 in the following way. 


Theorem 3. The orbits of the co-adjoint representation of a group in the dual 
space to the algebra are invariant manifolds for the flow in this space given 
by Euler’s equation. 


PROOF. $M_c(t)$ is obtained from $M_s(t)$ by the action of the co-adjoint 
representation, and $M_s(t)$ remains fixed. 


EXAMPLE. In the case of an ordinary rigid body, the orbits of the co-adjoint 
representation of the group in the space of momenta are the spheres 
$M_1^2 + M_2^2 + M_3^2 = \text{const}$. In this case Theorem 3 is reduced to the law of 
conservation of the length of the angular momentum. It consists of the fact 
that, if the initial point $M_c$ lies on some orbit (i.e., in the given case on the 
sphere $M^2 = \text{const}$), then all the points of its trajectory under the action of 
Euler’s equation lie on the same orbit. 
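For the ordinary rigid body this conservation law can be checked in one line. The computation below is a sketch that assumes the identification of the algebra and its dual with three-dimensional euclidean space described in the Remark of subsection C, under which { , } is the vector product taken with a minus sign, so that Euler's equation reads $\dot M_c = M_c \times \omega_c$.

```latex
\frac{d}{dt}\,(M_c, M_c)
  = 2\,(M_c, \dot M_c)
  = 2\,\bigl(M_c,\; M_c \times \omega_c\bigr)
  = 0 .
```

The last step uses only the fact that a vector product is orthogonal to each of its factors, so it does not depend on the moment of inertia operator.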


We now return to the general case of an arbitrary group G and recall that 
each orbit of the co-adjoint representation has a symplectic structure (cf. 
subsection A). Furthermore, the kinetic energy of the body can be expressed 
in terms of the angular momentum relative to the body. As a result we obtain 
a quadratic form on the space of angular momenta 

$$T = \tfrac12(M_c, A^{-1}M_c).$$

Let us fix one orbit $V$ of the co-adjoint representation. We consider the 
kinetic energy as a function on this orbit: 

$$H: V \to \mathbb{R}, \quad H(M_c) = \tfrac12(M_c, A^{-1}M_c).$$


Theorem 4. On every orbit V of the co-adjoint representation, Euler’s equation 
is hamiltonian with hamiltonian function H. 


PROOF. Every vector $\xi$ tangent to $V$ at a point $M$ has the form $\xi = \{f, M\}$, where $f \in \mathfrak{g}$. In 
particular, the vector field on the right side of Euler’s equation can be written in the form 
$X = \{dT, M\}$ (here the differential of the function $T$ at a point $M$ of the vector space $\mathfrak{g}^*$ is 
considered as a vector of the dual space to $\mathfrak{g}^*$, i.e., as an element of the Lie algebra $\mathfrak{g}$). It follows 
from the definitions of the symplectic structure $\Omega$ and the operation { , } (cf. subsection A) 
that for every vector $\xi$ tangent to $V$ at $M$, 

$$\Omega(\xi, X) = (M, [f, dT]) = (dT, \{f, M\}) = (dH, \xi). \qquad \square$$


Euler’s equation can be carried over from the dual space of the algebra to 
the algebra itself by inversion of the moment of inertia operator. As a result 
we obtain the following formulation of Euler’s equation in terms of the 
operation B (section B). 


Theorem 5. The motion of the vector of angular velocity in the body is deter- 
mined by the initial position of this vector and does not depend on the initial 
position of the body. The vector of angular velocity in the body satisfies an 
equation with quadratic right-hand side: 


$$\dot\omega_c = B(\omega_c, \omega_c).$$


We will call this equation Euler’s equation for angular velocity. We 
notice that, under the action of the operator $A^{-1}: \mathfrak{g}^* \to \mathfrak{g}$, the orbits of the 
co-adjoint representation are carried to invariant manifolds of Euler’s 
equation for angular velocity; these manifolds have symplectic structure, etc. 
However, unlike orbits in $\mathfrak{g}^*$, these invariant manifolds are not determined 
by the Lie group $G$ itself, but depend also on the choice of rigid body (i.e., 
moment of inertia operator). 
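For the ordinary rigid body the operation B can be written out explicitly. The sketch below assumes that the algebra is three-dimensional euclidean space with the vector product as commutator and that the scalar product is given by a diagonal inertia operator with hypothetical principal moments `INERTIA`; under these assumptions the defining identity gives B(c, a) as the inverse inertia operator applied to the vector product of Ac and a. The function names are illustrative.

```python
def cross(a, b):
    """Vector product in R^3, used here as the commutator [a, b] of so(3)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

INERTIA = [1.0, 2.0, 3.0]  # hypothetical principal moments of inertia

def apply_A(v):
    """Moment of inertia operator A in principal axes."""
    return [I * x for I, x in zip(INERTIA, v)]

def apply_A_inv(m):
    return [x / I for I, x in zip(INERTIA, m)]

def inner(a, b):
    """Scalar product <a, b> = (Aa, b) on the algebra."""
    return sum(x * y for x, y in zip(apply_A(a), b))

def B(c, a):
    """Operation B determined by <[a, b], c> = <B(c, a), b> for all b."""
    return apply_A_inv(cross(apply_A(c), a))

def euler_rhs(w):
    """Right-hand side of Euler's equation for angular velocity."""
    return B(w, w)

def classical_euler_rhs(w):
    """Textbook component form: I1 w1' = (I2 - I3) w2 w3, and cyclically."""
    I1, I2, I3 = INERTIA
    return [(I2 - I3) * w[1] * w[2] / I1,
            (I3 - I1) * w[2] * w[0] / I2,
            (I1 - I2) * w[0] * w[1] / I3]

a, b, c = [1.0, 2.0, 3.0], [-1.0, 0.5, 2.0], [0.3, -0.7, 1.1]
print(inner(cross(a, b), c), inner(B(c, a), b))   # two sides of the defining identity
w = [0.2, -1.0, 0.5]
print(euler_rhs(w), classical_euler_rhs(w))       # the two forms of Euler's equation
```

In this sketch the quadratic right-hand side B(w, w) reproduces the familiar component form of Euler's equations for a rigid body.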

From the law of conservation of energy we have 


Theorem 6. Euler’s equations (for momentum and angular velocity) have a 
quadratic first integral, whose value is equal to the kinetic energy 


$$T = \tfrac12(M_c, A^{-1}M_c) = \tfrac12(A\omega_c, \omega_c).$$
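The conservation of this quantity can also be read off directly from Euler's equation for angular velocity. The following short computation is a sketch that uses only the scalar product $\langle\ ,\ \rangle$ on the algebra, the identity defining the operation B (with all three arguments equal to $\omega_c$), and the skew-symmetry of the commutator.

```latex
\frac{dT}{dt}
  = \frac{d}{dt}\,\tfrac12\langle \omega_c, \omega_c \rangle
  = \langle \dot\omega_c, \omega_c \rangle
  = \langle B(\omega_c, \omega_c), \omega_c \rangle
  = \langle [\omega_c, \omega_c], \omega_c \rangle
  = 0 .
```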
E Stationary rotations and their stability 


A stationary rotation of a rigid body is a rotation for which the angular 
velocity in the body is constant (and thus also the angular velocity in space; 
it is easy to see that one implies the other). We know from the theory of an 
ordinary rigid body in $\mathbb{R}^3$ that stationary rotations are rotations around the 
major axes of the moment of inertia ellipsoid. Below, we formulate a general- 
ization of this theorem to the case of a rigid body with any Lie group. We note 
that stationary rotations are geodesics of left-invariant metrics which are one- 
parameter subgroups. We note also that the directions of the major axes of 
the inertia ellipsoid can be determined by looking at the stationary points of 
the kinetic energy on the sphere of vectors of momentum of fixed length. 


Theorem 7. The angular momentum (respectively, angular velocity) of a 
stationary rotation with respect to the body is a critical point of the energy 
on the orbit of the co-adjoint representation (respectively on the image of the 
orbit under the action of the operator $A^{-1}$). Conversely, every critical point 
of the energy on an orbit determines a stationary rotation. 


The proof is a straightforward computation or application of Theorem 4. 

We note that the partition of the space of momenta into orbits of the co- 
adjoint representation cannot be so easily constructed in the case of an 
arbitrary group as it was in the simple case of an ordinary rigid body; in that 
case it was the partition of three-dimensional space into spheres with center 
0 and the point 0 itself. In the general case, the orbits can have different 
dimensions, and the partition into orbits at some points may not be a 
fibering; such a singularity already appeared in the three-dimensional case 
at the point 0. 

We call a point M of the space of angular momenta a regular point if the 
partition of a neighborhood of M into orbits is diffeomorphic to a partition 
of euclidean space into parallel planes (in particular, all orbits near the point 
M have the same dimension). For example, for the group of rotations of 
three-dimensional space all points of the space of angular momenta are 
regular except the origin. 


Theorem 8. Suppose that a regular point $M$ of the space of angular momenta is 
a critical point of the energy on an orbit of the co-adjoint representation, 
and that the second differential of the energy $d^2H$ at this point is a (positive- 
or negative-)definite form. Then $M$ is a (Liapunov) stable equilibrium position 
of Euler’s equations. 


PROOF. It follows from the regularity of the orbits near this point that on 
every neighboring orbit there exists near $M$ a point which is a conditional 
maximum or minimum of energy. □ 
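For the ordinary rigid body the hypothesis of Theorem 8 can be checked by hand. The sketch below assumes principal axes, with $I_1, I_2, I_3$ denoting the principal moments of inertia and $M_1, M_2, M_3$ the components of the angular momentum in the body (this notation is introduced here for illustration), and restricts the energy to the sphere $M_1^2 + M_2^2 + M_3^2 = \mu^2$ near the point $(0, 0, \mu)$.

```latex
H = \frac12\left(\frac{M_1^2}{I_1} + \frac{M_2^2}{I_2} + \frac{M_3^2}{I_3}\right),
\qquad M_3^2 = \mu^2 - M_1^2 - M_2^2
\;\Longrightarrow\;
H = \frac{\mu^2}{2 I_3}
  + \frac12\left(\frac{1}{I_1} - \frac{1}{I_3}\right) M_1^2
  + \frac12\left(\frac{1}{I_2} - \frac{1}{I_3}\right) M_2^2 .
```

The quadratic part is definite exactly when $I_3$ is the largest or the smallest of the three moments, so under these assumptions the theorem gives stability of the rotations around the corresponding two axes and says nothing about the middle one.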
Theorem 9. The second differential of the kinetic energy, restricted to the image 
of an orbit of the co-adjoint representation in the algebra, is given at a 
critical point $\omega \in \mathfrak{g}$ by the formula 

$$2\,d^2H|_\xi = \langle B(\omega, f), B(\omega, f) \rangle + \langle [f, \omega], B(\omega, f) \rangle,$$

where $\xi$ is a tangent vector to this image, expressed in terms of $f$ by the 
formula 

$$\xi = B(\omega, f), \quad f \in \mathfrak{g}.$$


F Riemannian curvature of a group with 
left-invariant metric 


Let G be a Lie group provided with the left-invariant metric given by a 
scalar product < , > in the algebra. We note that the riemannian curvature 
of the group G at any point is determined by the curvature at the identity 
(since left translation maps the group to itself isometrically). Therefore, it is 
sufficient to calculate the curvature for two-dimensional planes lying in the 
Lie algebra. 


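Theorem 10. The curvature of a group in the direction determined by an 
orthonormal pair of vectors $\xi$, $\eta$ in the algebra is given by the formula 

$$K_{\xi\eta} = \langle \delta, \delta \rangle + 2\langle \alpha, \beta \rangle - 3\langle \alpha, \alpha \rangle - 4\langle B_\xi, B_\eta \rangle,$$

where $2\delta = B(\xi, \eta) + B(\eta, \xi)$, $2\beta = B(\xi, \eta) - B(\eta, \xi)$, $2\alpha = [\xi, \eta]$, 
$2B_\xi = B(\xi, \xi)$, $2B_\eta = B(\eta, \eta)$, and where $B$ is the operation defined in section B. 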


The proof is a tedious but straightforward calculation. It is based on the 
easily verified formula for covariant derivative 

$$\nabla_\xi \eta = \tfrac12\bigl([\xi, \eta] - B(\xi, \eta) - B(\eta, \xi)\bigr),$$

where $\xi$ and $\eta$ on the left are left-invariant vector fields and on the right are 
their values at the identity. 

Remark 1. In the case of a two-sided invariant metric, the formula for 
curvature has the particularly simple form 

$$K_{\xi\eta} = \tfrac14\langle [\xi, \eta], [\xi, \eta] \rangle.$$
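The formula of Theorem 10 and its special case in Remark 1 can be compared numerically on the rotation group. The sketch below is an illustration under assumptions: the algebra is three-dimensional euclidean space with the vector product as commutator, the metric is given by a diagonal inertia operator passed in as the hypothetical parameter `inertia`, and B is computed from its defining identity for that metric.

```python
def cross(a, b):
    """Vector product in R^3, used here as the commutator of so(3)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def inner(a, b, inertia):
    """Scalar product <a, b> = (Aa, b) for a diagonal inertia operator A."""
    return sum(I * x * y for I, x, y in zip(inertia, a, b))

def B(c, a, inertia):
    """Operation B from <[a, b], c> = <B(c, a), b>, for the metric above."""
    Ac = [I * x for I, x in zip(inertia, c)]
    return [x / I for I, x in zip(inertia, cross(Ac, a))]

def curvature(xi, eta, inertia):
    """Theorem 10 for a pair (xi, eta) that is orthonormal in the given metric."""
    b1, b2 = B(xi, eta, inertia), B(eta, xi, inertia)
    delta = [(p + q) / 2 for p, q in zip(b1, b2)]
    beta = [(p - q) / 2 for p, q in zip(b1, b2)]
    alpha = [x / 2 for x in cross(xi, eta)]
    b_xi = [x / 2 for x in B(xi, xi, inertia)]
    b_eta = [x / 2 for x in B(eta, eta, inertia)]
    return (inner(delta, delta, inertia)
            + 2 * inner(alpha, beta, inertia)
            - 3 * inner(alpha, alpha, inertia)
            - 4 * inner(b_xi, b_eta, inertia))

def curvature_two_sided(xi, eta, inertia):
    """Remark 1: valid only when the metric is two-sided invariant."""
    comm = cross(xi, eta)
    return inner(comm, comm, inertia) / 4

round_metric = [1.0, 1.0, 1.0]           # two-sided invariant metric on SO(3)
e1, e2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
print(curvature(e1, e2, round_metric), curvature_two_sided(e1, e2, round_metric))
```

For the two-sided invariant metric both functions return the same value in this sketch; for unequal moments only `curvature` applies, and the pair must first be made orthonormal with respect to the chosen metric.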


Remark 2. The formula for the curvature of a group with a right-invariant 
riemannian metric coincides with the formula for the left-invariant case. In 
fact, a right-invariant metric on a group is a left-invariant metric on the 
group with the reverse multiplication law ($g_1 * g_2 = g_2 g_1$). Passage to the 
reverse group changes the signs of both the commutator and the operation $B$ 
in the algebra. But, in every term of the formula for curvature, there is a 
product of two operations changing the sign. Therefore, the formula for 
curvature is the same in the right-invariant case. 
In Euler’s equation the right-hand side changes sign under passage to the 
right-invariant case. 


G Application to groups of diffeomorphisms 


Let D be a bounded region in a riemannian manifold. Consider the group of 
diffeomorphisms of D which preserve the volume element. We will denote 
this group by SDiffD. 

The Lie algebra corresponding to the group SDiffD consists of all vector 
fields with divergence 0 on $D$, tangent to the boundary (if it is not empty). We 
define the scalar product of two elements of this Lie algebra (i.e., two vector 
fields) as 

$$\langle v_1, v_2 \rangle = \int_D (v_1 \cdot v_2)\,dx,$$

where $(\cdot)$ is the scalar product giving the riemannian metric on $D$, and $dx$ 
is the riemannian volume element. 

We now consider the flow of a uniform ideal (incompressible, non-viscous) 
fluid on the region $D$. Such a flow is described by a curve $t \to g_t$ in 
the group SDiffD. Namely, the diffeomorphism $g_t$ is the map which carries 
every particle of the fluid from the place it was at time 0 to the place it is at 
time $t$. It turns out that the kinetic energy of the moving fluid is a 
right-invariant riemannian metric on the group of diffeomorphisms SDiffD. 


Indeed, suppose that after time $t$ the flow of the fluid gives a diffeomorphism $g_t$, and that 
the velocity at this moment of time is given by the vector field $v$. Then the diffeomorphism 
realized by the flow after time $t + \tau$ (where $\tau$ is small) will be $e^{v\tau}g_t$ up to a quantity small in 
comparison with $\tau$ (here $e^{v\tau}$ is the one-parameter group with vector $v$, i.e., the phase flow of the 
differential equation given by the field $v$). Therefore, the field of velocities $v$ is obtained from the 
vector $\dot g$ tangent to the group at the point $g$ by right translation. This also implies the 
right-invariance of the kinetic energy, which is by definition equal to 

$$T = \tfrac12\langle v, v \rangle$$


(we assume the density of the fluid to be 1). 


The principle of least action (which in mathematical terms is the definition 
of an ideal fluid) asserts that flows of an ideal fluid are geodesics in the right- 
invariant metric just described on the group of diffeomorphisms. 


Strictly speaking, an infinite-dimensional group of diffeomorphisms is not a manifold. 
Therefore the exact formulation of the definition above requires additional work: we must 
choose suitable functional spaces, prove a theorem on existence and uniqueness of solutions, 
etc. Up to now this has been done only in the case when the dimension of the region of the flow D 
is equal to 2. However, we will proceed as if these difficulties connected with infinite dimensions 
did not exist. Thus the following arguments are heuristic in character. It turns out that many 
of the results can be proved rigorously, independently of the theory of infinite-dimensional 
manifolds. 


We will now indicate the form that the general formulas introduced above 
take in the case $G$ = SDiffD, where $D$ is a connected region with finite 
volume in a three-dimensional riemannian manifold. To do this we must 
first describe explicitly the bilinear operation $B: \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ defined in 
section B by the formula 

$$\langle [a, b], c \rangle = \langle B(c, a), b \rangle.$$

It is easy to verify that in the three-dimensional case the vector field 
$B(c, a)$ can be expressed in terms of the vector fields $a$ and $c$ of our Lie algebra 
by the formula 

$$B(c, a) = (\operatorname{curl} c) \wedge a + \operatorname{grad} \alpha,$$

where $\wedge$ denotes the vector product, and $\alpha$ the single-valued function on $D$ 
which is uniquely (up to a constant summand) determined by the condition 
$B \in \mathfrak{g}$ (i.e., the conditions $\operatorname{div} B = 0$ and $B$ is tangent to the boundary of $D$). 

We note that the operation B does not depend on the choice of orientation, 
since the vector product and curl both change sign with a change of orienta- 
tion. 

Stationary flows. Euler’s equation for “angular velocity” in the case 
$G$ = SDiffD has the form $\dot v = -B(v, v)$, since the metric is right-invariant. 
Therefore, in the case of the group of diffeomorphisms of three-dimensional 
space, it takes the form of “the equations of motion in Bernoulli’s form” 

$$\frac{\partial v}{\partial t} = v \wedge \operatorname{curl} v + \operatorname{grad} \alpha, \quad \operatorname{div} v = 0.$$

Euler’s equation for momentum is written in the form of the “vorticity 
equation” 

$$\frac{\partial \operatorname{curl} v}{\partial t} = [v, \operatorname{curl} v].$$

In particular, the vorticity of a stationary flow commutes with the field of 
velocities. 

This remark leads quickly to a topological classification of stationary 
flows of an ideal fluid in three-dimensional space. 


Theorem 11. Assume that the region $D$ is bounded by a compact analytic surface, 
and that the field of velocities is analytic and not everywhere collinear with 
its curl. Then the region of the flow can be partitioned by an analytic 
submanifold into a finite number of cells, in each of which the flow is constructed 
in a standard way. Namely, the cells are of two types: those fibered into tori 
invariant under the flow and those fibered into surfaces invariant under the 
flow, diffeomorphic to the annulus $\mathbb{R} \times S^1$. On each of these tori the flow 
lines are either all closed or all dense, and on each annulus all the flow lines 
are closed. 


To prove this theorem we look at the “Bernoulli surfaces,” i.e., the level 
surfaces of the function $\alpha$. It follows from the condition for a flow to be 
stationary ($v \wedge \operatorname{curl} v = -\operatorname{grad} \alpha$) that both the flow lines and the vortex 
lines lie on the Bernoulli surface. Since the fields of velocity and vorticity 
commute, the group $\mathbb{R}^2$ acts on the closed Bernoulli surface, and it must be a 
torus (cf. the proof of Liouville’s theorem in Section 49). An analogous 
calculation for the boundary conditions on the boundary of $D$ shows that the 
non-closed Bernoulli surfaces consist of annuli with closed flow lines. 

Remark. The analyticity of the field of velocities is not very essential, but 
it is important that the fields of velocity and vorticity not be collinear. 
Computer experiments conducted by M. Hénon show more complicated 
behavior than described in the theorem for the flow lines of a stationary flow 
on the three-dimensional torus; this field is given by the formulas 


$$v_x = A \sin z + C \cos y, \quad v_y = B \sin x + A \cos z, \quad v_z = C \sin y + B \cos x.$$


The formulas are selected so that the vectors v and curl v are collinear. The 
results of Hénon’s calculations suggest that some flow lines densely fill up a 
three-dimensional region. 
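That the field above is collinear with its curl can be confirmed numerically. The sketch below is an illustration, not Hénon's computation: the coefficient values, the sample points, and the finite-difference step are assumptions, and the curl and divergence are approximated by central differences.

```python
import math

A, B, C = 1.0, 0.7, 0.4   # hypothetical coefficient values

def v(p):
    """The velocity field given by the formulas in the text."""
    x, y, z = p
    return [A * math.sin(z) + C * math.cos(y),
            B * math.sin(x) + A * math.cos(z),
            C * math.sin(y) + B * math.cos(x)]

def partial(f, p, i, j, h=1e-5):
    """Central difference of component i of the field f along coordinate j."""
    hi, lo = list(p), list(p)
    hi[j] += h
    lo[j] -= h
    return (f(hi)[i] - f(lo)[i]) / (2 * h)

def curl(f, p):
    return [partial(f, p, 2, 1) - partial(f, p, 1, 2),
            partial(f, p, 0, 2) - partial(f, p, 2, 0),
            partial(f, p, 1, 0) - partial(f, p, 0, 1)]

def div(f, p):
    return sum(partial(f, p, i, i) for i in range(3))

for point in ([0.3, -1.2, 2.0], [1.7, 0.4, -0.9]):
    print("v    =", v(point))
    print("curl =", curl(v, point))
    print("div  =", div(v, point))
```

At the sample points the curl agrees with the field itself and the divergence vanishes, up to discretization error, which is the collinearity the remark refers to.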


I Isovorticial fields 


Two-dimensional hydrodynamics differs sharply from three-dimensional 
hydrodynamics. The essence of this difference is contained in the difference 
in the geometries of the orbits of the co-adjoint representation in the two- 
and three-dimensional cases. In the two-dimensional case the orbits are in 
some sense closed and behave, for example, like a family of level sets of a 
function (more precisely of several functions: actually even an infinite number 
of functions). In the three-dimensional case the orbits are more complicated; 
in particular, they are unbounded (and perhaps dense). The orbits of the 
co-adjoint representation of the group of diffeomorphisms of a three-dimensional 
riemannian manifold can be described in the following way. Let $v_1$ and $v_2$ be 
two vector fields of velocities of an incompressible fluid in the region $D$. 
We say that the fields $v_1$ and $v_2$ are isovorticial if there is a volume-preserving 
diffeomorphism $g: D \to D$ which carries every closed contour $\gamma$ in $D$ to a new 
contour such that the circulation of the first field along the original contour 
is equal to the circulation of the second field along the new contour: 

$$\oint_\gamma v_1 = \oint_{g\gamma} v_2.$$


It is easy to verify that the image of an orbit of the co-adjoint representation 
in the algebra (under the action of the inverse of the inertia operator, $A^{-1}$) is 
none other than the set of fields isovorticial to the given field. 

In particular, Theorem 3 now takes the form of the following law of con- 
servation of circulation: 


Theorem 12. The circulation of a field of velocities of an ideal fluid over a closed 
fluid contour does not change when the contour is carried by the flow to a 
new position. 
We note that if two fields of velocities of a three-dimensional ideal fluid 
on $D$ are isovorticial, then the corresponding diffeomorphism carries the curl 
of the first field into the curl of the second: 

$$g_* \operatorname{curl} v_1 = \operatorname{curl} v_2.$$

Furthermore, the isovorticity of two fields can be defined as the equivalence 
of the fields of vorticity, if the region of the flow is simply connected. Therefore, 
the problem of the orbits of the co-adjoint representation in the 
three-dimensional case includes the problem of classifying vector fields with 
divergence zero up to volume-preserving diffeomorphisms. This last problem 
in three dimensions is hopelessly difficult. 

We now consider the two-dimensional case. First, we translate the basic 
formulas into notation convenient for considering the two-dimensional case. 
We assume that the region D of the flow is two-dimensional and oriented. 
The metric and orientation give a symplectic structure on D; the vector field 
of velocities has divergence zero and is therefore hamiltonian. Therefore, this 
field is given by a hamiltonian function (many-valued, in general, if the region 
D is not simply connected). The hamiltonian function of a field of velocities 
is called the stream function in hydrodynamics, and is denoted by $\psi$. Thus 

$$v = I \operatorname{grad} \psi,$$

where $I$ is the operator of clockwise rotation by 90°. 

The stream function of the commutator of two fields turns out to be the 
jacobian (or the Poisson bracket of hamiltonian formalism) of the stream 
functions of the original fields: 

$$\psi_{[v_1, v_2]} = J(\psi_1, \psi_2).$$

The vector field $B(c, a)$ is given, in the two-dimensional case, by the formula 

$$B = -(\Delta\psi_c) \operatorname{grad} \psi_a + \operatorname{grad} \alpha,$$

where $\psi_a$ and $\psi_c$ are the stream functions of the fields $a$ and $c$, and 
$\Delta = \operatorname{div} \operatorname{grad}$ is the laplacian. 

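In the particular case of the euclidean plane with cartesian coordinates $x$ 
and $y$, the formulas for stream function, commutator and laplacian take the 
particularly simple form 

$$v_x = \frac{\partial\psi}{\partial y}, \quad v_y = -\frac{\partial\psi}{\partial x}, \quad \psi_{[v_1, v_2]} = \frac{\partial\psi_1}{\partial x}\frac{\partial\psi_2}{\partial y} - \frac{\partial\psi_1}{\partial y}\frac{\partial\psi_2}{\partial x}, \quad \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}.$$

The vorticity (or curl) of a two-dimensional field of velocities is the scalar 
function $r$ such that the integral over any oriented region $\sigma$ in $D$ of the 
product of $r$ with the oriented area element is equal to the circulation of the 
field of velocities around the boundary of $\sigma$: 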


$$\iint_\sigma r\,dS = \oint_{\partial\sigma} v.$$

It is easy to compute an expression for the vorticity in terms of the stream 
function: 

$$r = -\Delta\psi.$$
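The relation between vorticity and stream function can be checked on the euclidean plane. The sketch below is an illustration under assumptions: the stream function `psi` is a hypothetical example, the velocity is taken as the gradient of `psi` rotated clockwise by 90 degrees, the vorticity is computed as the planar curl of that velocity, and all derivatives are approximated by central differences with step `H`.

```python
import math

def psi(x, y):
    """A hypothetical stream function on the euclidean plane."""
    return math.sin(x) * math.cos(2 * y) + x * x * y

H = 1e-3  # step for central differences

def d_dx(f, x, y):
    return (f(x + H, y) - f(x - H, y)) / (2 * H)

def d_dy(f, x, y):
    return (f(x, y + H) - f(x, y - H)) / (2 * H)

def v_x(x, y):
    return d_dy(psi, x, y)      # first component of the rotated gradient

def v_y(x, y):
    return -d_dx(psi, x, y)     # second component of the rotated gradient

def vorticity(x, y):
    """Planar curl of the velocity field."""
    return d_dx(v_y, x, y) - d_dy(v_x, x, y)

def divergence(x, y):
    return d_dx(v_x, x, y) + d_dy(v_y, x, y)

def laplacian_exact(x, y):
    """Laplacian of psi, computed by hand for this particular example."""
    return -5 * math.sin(x) * math.cos(2 * y) + 2 * y

x0, y0 = 0.4, -0.7
print("vorticity:       ", vorticity(x0, y0))
print("minus laplacian: ", -laplacian_exact(x0, y0))
print("divergence:      ", divergence(x0, y0))
```

At the sample point the numerically computed vorticity matches minus the laplacian of the stream function, and the divergence of the velocity field vanishes, both up to discretization error.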


In the two-dimensional simply connected case, isovorticity of fields $v_1$ 
and $v_2$ means simply that the functions $r_1$ and $r_2$ (the vorticities of these 
fields) are carried to one another under a suitable volume-preserving 
diffeomorphism. 

Under such conditions the two functions $r_1$ and $r_2$ have the same 
distribution function, i.e., 

$$\operatorname{mes}\{x \in D: r_1(x) \le c\} = \operatorname{mes}\{x \in D: r_2(x) \le c\},$$

for any number $c$. Therefore, if two fields are in the image of the same orbit 
of the co-adjoint representation, then a whole series of functionals are equal; 
for example, the integrals of all powers of the vorticity 

$$\iint_D r_1^k\,dS = \iint_D r_2^k\,dS.$$
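The equality of these integrals is a change of variables. The sketch below assumes that the diffeomorphism $g$ carries $r_1$ to $r_2$ in the sense $r_2(x) = r_1(g^{-1}x)$ and substitutes $x = g(y)$; the symbol $Dg$ for the derivative of $g$ is introduced here for illustration.

```latex
\iint_D r_2^k(x)\,dS_x
  = \iint_D r_1^k\bigl(g^{-1}x\bigr)\,dS_x
  = \iint_D r_1^k(y)\,\bigl|\det Dg(y)\bigr|\,dS_y
  = \iint_D r_1^k(y)\,dS_y ,
```

since $|\det Dg| = 1$ for a volume-preserving diffeomorphism. The same substitution applied to the set $\{r_2 \le c\}$ gives the equality of the distribution functions.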


In particular, Euler’s equations of motion of a two-dimensional ideal fluid 

$$\frac{\partial v}{\partial t} + v\nabla v = -\operatorname{grad} p, \quad \operatorname{div} v = 0,$$

have an infinite collection of first integrals. For example, the integral of any 
power of the vorticity of the field of velocities 

$$\iint_D r^k\,dS$$

is such a first integral. 

The existence of these first integrals (i.e., the relatively simple structure of 
orbits of the co-adjoint representation) allows us to prove theorems on 
existence and uniqueness, etc. in the two-dimensional hydrodynamics of an 
ideal (and also of a viscous) fluid; the complicated geometry of orbits of the 
co-adjoint representation in the three-dimensional case (or, perhaps, in- 
sufficient information about these orbits) makes the foundations of three- 
dimensional hydrodynamics a very hard problem. 
J Stability of planar stationary flows 


Here we formulate general theorems about stationary rotations (Theorems 
7, 8, and 9 above) for the case of a group of diffeomorphisms. We obtain in 
this way the following assertions: 


1. A stationary flow of an ideal fluid is distinguished from all flows iso- 
vorticial to it by the fact that it is a conditional extremum (or critical point) 
of the kinetic energy. 

2. If (i) the indicated critical point is actually an extremum, i.e., a local 
conditional maximum or minimum, (ii) it satisfies certain (generally satisfied) 
regularity conditions, and (iii) the extremum is non-degenerate (the 
second differential is positive- or negative-definite), then the stationary 
flow is stable (i.e., is a Liapunov stable equilibrium position of Euler’s 
equation). 

3. The formula for the second differential of the kinetic energy, on the tangent 
space to the manifold of fields which are isovorticial to a given one, has the 
following form in the two-dimensional case. Let $D$ be a region in the 
euclidean plane with cartesian coordinates $x$ and $y$. Consider a stationary 
flow with stream function $\psi = \psi(x, y)$. Then 
$2\,d^2H = \iint_D (\delta v)^2 + (\nabla\psi/\nabla\Delta\psi)(\delta r)^2\,dx\,dy$, where $\delta v$ is the variation of the field of velocities 
(i.e., a vector of the tangent space indicated above), and $\delta r = \operatorname{curl} \delta v$. 


We note that for a stationary flow, the gradient vectors of the stream 
function and its laplacian are collinear. Therefore the ratio $\nabla\psi/\nabla\Delta\psi$ makes 
sense. Furthermore, in a neighborhood of every point where the gradient of 
the vorticity is not zero, the stream function is a function of the vorticity 
function. 

The assertions introduced above lead to the conclusion that the positive- 
or negative-definiteness of the quadratic form $d^2H$ is a sufficient condition 
for stability of the stationary flow under consideration. This conclusion does 
not formally follow from Theorems 7, 8, and 9 since the application of any of 
our formulas in the infinite-dimensional case requires justification. Fortu- 
nately, we can justify the final conclusion about stability without justifying 
the intermediate constructions. Thus we can rigorously prove the following 
a priori bounds (expressing the stability of a stationary flow in terms of small 
perturbations of the initial velocity field). 


Theorem 13. Suppose that the stream function of a stationary flow, $\psi = \psi(x, y)$, 
in a region $D$ is a function of the vorticity function (i.e., of the function $\Delta\psi$) not 
only locally, but globally. Suppose that the derivative of the stream function 
with respect to the vorticity satisfies the inequality 

$$c \le \frac{\nabla\psi}{\nabla\Delta\psi} \le C, \quad \text{where } 0 < c \le C < \infty.$$

Let $\psi + \varphi(x, y, t)$ be the stream function of another flow, not necessarily 
stationary. Assume that, at the initial moment, the circulation of the velocity 
field of the perturbed flow (with stream function $\psi + \varphi$) around every boundary 
component of the region $D$ is equal to the circulation of the original flow (with 
stream function $\psi$). Then the perturbation $\varphi = \varphi(x, y, t)$ at every moment 
of time is bounded in terms of the initial perturbation $\varphi_0 = \varphi(x, y, 0)$ by the 
formula 

$$\iint_D (\nabla\varphi)^2 + c(\Delta\varphi)^2\,dx\,dy \le \iint_D (\nabla\varphi_0)^2 + C(\Delta\varphi_0)^2\,dx\,dy.$$

If the stationary flow satisfies the inequality 

$$c \le -\frac{\nabla\psi}{\nabla\Delta\psi} \le C, \quad 0 < c \le C < \infty,$$

then the perturbation $\varphi$ is bounded in terms of $\varphi_0$ by the formula 

$$\iint_D c(\Delta\varphi)^2 - (\nabla\varphi)^2\,dx\,dy \le \iint_D C(\Delta\varphi_0)^2 - (\nabla\varphi_0)^2\,dx\,dy.$$

This theorem implies the stability of a stationary flow in the case of a 
positive-definite quadratic form 

$$\iint_D (\nabla\varphi)^2 + \frac{\nabla\psi}{\nabla\Delta\psi}(\Delta\varphi)^2\,dx\,dy$$

with respect to $\nabla\varphi$ (where $\varphi$ is a function constant on every component of the 
boundary of $D$ whose gradient flow is zero over every boundary component), 
and also in the case of a negative definite form 

$$\iint_D (\nabla\varphi)^2 + \left(\max \frac{\nabla\psi}{\nabla\Delta\psi}\right)(\Delta\varphi)^2\,dx\,dy.$$


EXAMPLE 1. Consider a planar parallel flow in the strip $Y_1 \le y \le Y_2$ in the
(x, y)-plane with velocity profile v(y) (i.e., with velocity field (v(y), 0)). Such
a flow is stationary for any velocity profile. To make the region of the flow 
compact, we impose the condition that the velocity fields of all flows under 
consideration be periodic with period X in the x-coordinate. 

The conditions of Theorem 13 are fulfilled if the velocity profile has no 
points of inflection (i.e., if $d^2v/dy^2 \ne 0$). We come to the conclusion that
planar parallel flows of an ideal fluid with no inflection points in the velocity 
profile are stable. 


The analogous proposition in the linearized problem is called Rayleigh’s 
theorem. 
We emphasize that in Theorem 13 it is not a question of stability “in a linear
approximation,” but of actual strict Liapunov stability (i.e., with respect to finite perturbations in the
nonlinear problem). The difference between these two forms of stability is substantial in this 
case, since our problem has a hamiltonian character (cf. Theorem 4); for hamiltonian systems 
asymptotic stability is impossible, so stability in a linear approximation is always neutral and 
insufficient for a conclusion about the stability of an equilibrium position of the nonlinear 
problem. 


EXAMPLE 2. Consider the planar-parallel flow on the torus 


$$\{(x, y),\ x \bmod X,\ y \bmod 2\pi\}$$


with velocity field v = (sin y, 0), parallel to the x-axis. This field is
determined by the stream function $\psi = -\cos y$ and has vorticity
$r = -\cos y$. The velocity profile has two inflection points, but the stream
function can be expressed as a function of the vorticity. The ratio
$\nabla\psi/\nabla\Delta\psi$ is equal to minus one. By applying Theorem 13 we
can convince ourselves of the stability of our stationary flow in the case when

$$\int_0^{2\pi}\!\!\int_0^X (\Delta\varphi)^2 \, dx\,dy \ge \int_0^{2\pi}\!\!\int_0^X (\nabla\varphi)^2 \, dx\,dy$$

for all functions $\varphi$ of period X in x and $2\pi$ in y. It is easy to
calculate that the last inequality is satisfied for $X \le 2\pi$ and violated
for $X > 2\pi$.
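The calculation behind the last claim can be sketched numerically. The sketch below assumes the standard Fourier argument, which the text does not spell out: for a mode with wave vector k = (2πm/X, n), the integral of (Δφ)² is proportional to |k|⁴ and that of (∇φ)² to |k|², so the inequality holds for all φ exactly when |k|² ≥ 1 for every nonzero wave vector. The function names and the truncation `max_index` are illustrative choices, not part of the original.

```python
import math


def min_nonzero_k_squared(X, max_index=5):
    """Smallest |k|^2 over nonzero wave vectors k = (2*pi*m/X, n).

    Only indices up to max_index are scanned; the minimum is always
    attained at small indices, so a small bound is enough.
    """
    best = math.inf
    for m in range(-max_index, max_index + 1):
        for n in range(-max_index, max_index + 1):
            if (m, n) == (0, 0):
                continue  # the constant mode contributes zero to both sides
            k_sq = (2 * math.pi * m / X) ** 2 + n ** 2
            best = min(best, k_sq)
    return best


def inequality_holds(X):
    """True if the integral of (Laplacian phi)^2 dominates that of
    (grad phi)^2 for every doubly periodic phi, checked mode by mode."""
    return min_nonzero_k_squared(X) >= 1.0
```

For instance, `inequality_holds(math.pi)` is true (short torus), while `inequality_holds(3 * math.pi)` is false because the mode m = ±1, n = 0 has |k|² = 4/9 < 1.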

Thus Theorem 13 implies the stability of a sinusoidal stationary flow on a
short torus, when the period in the direction of the basic flow (X) is less than
the width of the flow ($2\pi$). On the other hand, we can directly verify that on a
long torus (for $X > 2\pi$) our sinusoidal flow is unstable.⁹⁷ Thus, in this
example, the sufficient condition for stability from Theorem 13 turns out to
be necessary.


We should note that in general an indefinite quadratic form $\delta^2 H$ does not imply instability
of the corresponding flow. In general, an equilibrium position of a hamiltonian system can be
stable even though the hamiltonian function at this position is neither a maximum nor a
minimum. The quadratic hamiltonian $H = p_1^2 + q_1^2 - p_2^2 - q_2^2$ is the simplest example of this kind.


K Riemannian curvature of a group of diffeomorphisms 


The expression for the curvature of a Lie group provided with a
one-sided-invariant metric, introduced in subsection E, makes sense also for the
group SDiff D of diffeomorphisms of a riemannian domain D. This group is
the configuration space for an ideal fluid filling the domain D. The kinetic
energy defines a right-invariant metric on SDiff D. The number which we
obtain by formally applying the formula for the curvature of a Lie group to
this infinite-dimensional group is naturally called the curvature of the group
SDiff D.


⁹⁷ Cf., for example, the article of L. D. Meshalkin and Ya. G. Sinai, “Investigation of the stability
of a stationary solution of a system of equations for the plane movement of an incompressible
viscous liquid,” J. Applied Math. Mech. 25 (1962), 1700-1705.

Calculation of the curvature of a group of diffeomorphisms has been
carried out completely only in the case of a flow on the two-dimensional
torus with euclidean metric. Such a torus is obtained from the euclidean
plane $\mathbb{R}^2$ by identifying points whose difference lies in some lattice (a discrete
subgroup of the plane). An example of such a lattice is the set of points with
integral coordinates. In general, to obtain an arbitrary lattice $\Gamma$ we may
replace the square lying at the basis of this special lattice by any parallelogram.

Now consider the Lie algebra of vector fields with divergence zero on the
torus with a single-valued stream function. The corresponding group
$S_0\mathrm{Diff}\,T^2$ consists of volume-preserving diffeomorphisms which leave the
center of mass of the torus fixed. It is embedded in the group $\mathrm{SDiff}\,T^2$ of all
volume-preserving diffeomorphisms as a totally geodesic submanifold (i.e.,
a submanifold such that each of its geodesics is a geodesic in the ambient
manifold).

The proof consists of the fact that if, at the initial moment, a velocity 
field of an ideal fluid has a single-valued stream function, then at all other 
moments of time the stream function will also be single-valued; this follows 
from the law of conservation of momentum. 

We will now investigate the curvature of the group $S_0\mathrm{Diff}\,T^2$ in all
possible two-dimensional directions passing through the identity of the group
(the curvature of the group $\mathrm{SDiff}\,T^2$ in every such direction is the same, since
the submanifold $S_0\mathrm{Diff}\,T^2$ is totally geodesic).

Choose an orientation on $\mathbb{R}^2$. Then elements of the Lie algebra of the
group $S_0\mathrm{Diff}\,T^2$ can be thought of as real functions on the torus having
average value zero (a field with divergence zero is obtained from such a
function by considering it to be a stream function). Therefore, a
two-dimensional direction in the tangent space to the group $S_0\mathrm{Diff}\,T^2$ is
determined by a pair of functions on the torus with average value zero.

We will give such a function by the set of its Fourier coefficients. It is
convenient to carry out all calculations with Fourier series in the complex
domain. We let $e_k$ (where k, called a wave vector, is a point of the euclidean
plane) denote the function whose value at a point x of our plane is equal to
$e^{i(k, x)}$. Such a function determines a function on the torus if it is
$\Gamma$-periodic, i.e., if adding a vector from the lattice $\Gamma$ to x does not
change the value of the function.

In other words, the scalar product (k, x) must be a multiple of $2\pi$ for all
$x \in \Gamma$. All such vectors k belong to a lattice $\Gamma^*$ on $\mathbb{R}^2$. The functions $e_k$, where
$k \in \Gamma^*$, form a complete system in the space of complex functions on the torus.

We now complexify our Lie algebra, scalar product < , >, commutator
[ , ] and operation B in the algebra, as well as the riemannian connection
and curvature tensor $\Omega$, so that all these functions become (multi-) linear in
the complex vector space of the complexified Lie algebra. The functions $e_k$
(where $k \in \Gamma^*$, $k \ne 0$) form a basis of this vector space.
Theorem 14. The explicit formulas for the scalar product, commutator,
operation B, connection, and curvature of a right-invariant metric on the group
$S_0\mathrm{Diff}\,T^2$ have the following form:

$$\langle e_k, e_l \rangle = 0 \quad \text{for } k + l \ne 0,$$
$$\langle e_k, e_{-k} \rangle = k^2 S;$$
$$[e_k, e_l] = (k \wedge l)\, e_{k+l};$$
$$B(e_k, e_l) = b_{k,l}\, e_{k+l}, \quad \text{where } b_{k,l} = (k \wedge l)\frac{k^2}{(k + l)^2};$$
$$\nabla_{e_k} e_l = d_{l,k+l}\, e_{k+l}, \quad \text{where } d_{u,v} = \frac{(v \wedge u)(u, v)}{v^2};$$

$R_{k,l,m,n} = 0$ if $k + l + m + n \ne 0$; if $k + l + m + n = 0$, then

$$R_{k,l,m,n} = (a_{ln} a_{km} - a_{lm} a_{kn})\, S, \quad \text{where } a_{uv} = \frac{(u \wedge v)^2}{|u + v|}.$$

In these formulas, S is the area of the torus, and $u \wedge v$ the area of the
parallelogram spanned by u and v (with respect to the chosen orientation of
$\mathbb{R}^2$). The parentheses denote the euclidean scalar product in the plane, and
angled brackets denote the scalar product in the Lie algebra.

The proof of this theorem is in the first article listed in the introduction to 
this appendix. 

The formulas above allow us to calculate the curvature in any two- 
dimensional direction. These calculations show that in most directions the 
curvature is negative, but in a few it is positive. Consider, for instance, some 
fluid flow, i.e. a geodesic of our group. By Jacobi’s equations, the stability of 
this geodesic is determined by the curvatures in the directions of all possible 
two-dimensional planes passing through the velocity vector of the geodesic 
at each of its points. 

Assume now that the flow under consideration is stationary. Then the geo- 
desic is a one-parameter subgroup of our group. From this it follows that the 
curvatures in the directions of all planes passing through velocity vectors of 
the geodesic at all of its points are equal to the curvatures in the corresponding 
planes going through the velocity vector of this geodesic at the initial moment 
of time (Proof: right translate to the identity element of the group). Thus the 
stability of a stationary flow depends only on the curvatures in the directions 
of those two-dimensional planes in the Lie algebra which contain the vector 
of the Lie algebra which is the velocity field of the stationary flow. 

Consider, for example, the simplest parallel sinusoidal stationary flow. 
Such a flow is given by the stream function 

$$\xi = \frac{e_k + e_{-k}}{2}.$$

Consider any other real vector of the algebra, $\eta = \sum x_l e_l$ (so $x_{-l} = \bar{x}_l$). We
deduce easily from Theorem 14 that
Theorem 15. The curvature of the group $S_0\mathrm{Diff}\,T^2$ in any two-dimensional
plane containing the direction $\xi$ is non-positive. Namely,

$$\langle \Omega(\xi, \eta)\xi, \eta \rangle = -\frac{S}{4} \sum_l a_{kl}^2\, |x_l + x_{l+2k}|^2.$$

From this formula it follows, in particular, that


1. The curvature is equal to zero only for those two-dimensional planes which
consist of parallel flows in the same direction as $\xi$, so that $[\xi, \eta] = 0$;

2. The curvature in the plane defined by the flow functions $\xi = \cos(k, x)$,
$\eta = \cos(l, x)$ is

$$K = -\frac{k^2 + l^2}{4S} \sin^2\alpha\, \sin^2\beta,$$

where S is the area of the torus, $\alpha$ is the angle between k and l, and $\beta$ is the
angle between k + l and k - l;

3. In particular, the curvature of the group of diffeomorphisms of the torus
$\{(x, y) \bmod 2\pi\}$ in directions determined by the velocity fields (sin y, 0),
(0, sin x) is equal to

$$K = -\frac{1}{8\pi^2}.$$
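As a check on items 2 and 3, the sum in Theorem 15 can be evaluated directly for a vector η with finitely many Fourier coefficients. The sketch below is an illustration under stated assumptions, not the author's computation: wave vectors are integer pairs, η is passed as a dictionary `x` from wave vectors to coefficients (assumed to satisfy the reality condition), and the sectional curvature is taken to be the curvature form divided by ⟨ξ,ξ⟩⟨η,η⟩ − ⟨ξ,η⟩², with the scalar products from Theorem 14. All function names are hypothetical.

```python
import math


def wedge(u, v):
    """Oriented area of the parallelogram spanned by u and v."""
    return u[0] * v[1] - u[1] * v[0]


def norm_sq(u):
    return u[0] ** 2 + u[1] ** 2


def a_squared(u, v):
    """a_{uv}^2 with a_{uv} = (u ^ v)^2 / |u + v|; zero when u ^ v = 0."""
    w = wedge(u, v)
    if w == 0:
        return 0.0  # also covers u + v = 0, where the quotient is undefined
    return w ** 4 / norm_sq((u[0] + v[0], u[1] + v[1]))


def curvature_form(k, x, S):
    """<Omega(xi, eta) xi, eta> from Theorem 15 for xi = (e_k + e_{-k}) / 2."""
    two_k = (2 * k[0], 2 * k[1])
    # A term can be nonzero only if l or l + 2k carries a coefficient.
    candidates = set(x) | {(l[0] - two_k[0], l[1] - two_k[1]) for l in x}
    total = 0.0
    for l in candidates:
        shifted = (l[0] + two_k[0], l[1] + two_k[1])
        total += a_squared(k, l) * abs(x.get(l, 0) + x.get(shifted, 0)) ** 2
    return -S / 4 * total


def sectional_curvature(k, x, S):
    """Curvature form divided by the squared area spanned by xi and eta."""
    xi_xi = norm_sq(k) * S / 2
    eta_eta = S * sum(abs(c) ** 2 * norm_sq(l) for l, c in x.items())
    xi_eta = norm_sq(k) * S * complex(x.get(k, 0)).real
    return curvature_form(k, x, S) / (xi_xi * eta_eta - xi_eta ** 2)


def closed_form(k, l, S):
    """Item 2: K = -(k^2 + l^2) / (4S) * sin^2(alpha) * sin^2(beta)."""
    p = (k[0] + l[0], k[1] + l[1])
    q = (k[0] - l[0], k[1] - l[1])
    sin2_alpha = wedge(k, l) ** 2 / (norm_sq(k) * norm_sq(l))
    sin2_beta = wedge(p, q) ** 2 / (norm_sq(p) * norm_sq(q))
    return -(norm_sq(k) + norm_sq(l)) / (4 * S) * sin2_alpha * sin2_beta


def cosine_mode(l):
    """Fourier coefficients of eta = cos(l, x)."""
    return {l: 0.5, (-l[0], -l[1]): 0.5}
```

With `S = 4 * math.pi ** 2`, the call `sectional_curvature((0, 1), cosine_mode((1, 0)), S)` reproduces the value in item 3, and it agrees with `closed_form` for other pairs k ≠ ±l.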


L Discussion 


It is natural to expect that the curvature of a group of diffeomorphisms is
related to the stability of geodesics in this group (i.e., to the stability of flows
of an ideal fluid) in the same way as the curvature of a finite-dimensional Lie
group is related to the stability of geodesics on it. Namely, negative curvature
causes exponential instability of geodesics. The characteristic path length
(the average path length in which errors in the initial conditions grow e
times) has order of magnitude $1/\sqrt{-K}$. Thus, knowing the curvatures of a
group of diffeomorphisms allows us to estimate the time for which we can
predict the development of the flow of an ideal fluid by means of an
approximate initial velocity field before the error grows to a large order.


It should be emphasized that instability of a flow of an ideal fluid is here understood dif- 
ferently than in section K; it is a question of exponential instability of the motion of the fluid, 
not of its velocity field. It is possible for a stationary flow to be a Liapunov stable solution of 
Euler’s equation while the corresponding motion of the fluid is exponentially unstable. The 
reason is that a small change in the velocity field of a fluid can induce an exponentially growing 
change in the motion of the fluid. In such a case (stability of the solution of Euler’s equation 
and negative curvature of the group) we can predict the velocity field, but we cannot predict 
the motion of the fluid mass without a great loss of accuracy. 


The formulas mentioned above for curvature can be used even for rough
estimates of the time over which a long-term dynamical prediction of the
weather is impossible, if we agree to a few simplifying assumptions. These
simplifying assumptions consist of the following:


1. The earth has the shape of a torus obtained by factoring the plane by a 
square lattice. 

2. The atmosphere is a two-dimensional homogeneous incompressible 
inviscid fluid. 

3. The motion of the atmosphere is approximately a “tradewind current,” 
parallel to the equator of the torus and having sinusoidal velocity profile. 


To calculate the characteristic path length we must then estimate the
curvature of the group $S_0\mathrm{Diff}\,T^2$ in directions containing the “tradewind
current” $\xi$ from Theorem 15. To do this we will look at $T^2$ as $\{(x, y) \bmod 2\pi\}$,
k = (0, 1). In other words, we look at $2\pi$-periodic flows on the (x, y)-plane
close to a stationary flow, parallel to the x-axis and with sinusoidal velocity
profile

$$v = (\sin y, 0).$$

It is easy to see from the formula in Theorem 15 that the curvature of the
group $S_0\mathrm{Diff}\,T^2$ in the planes containing our tradewind current v varies
within the limits

$$-\frac{2}{S} < K < 0, \quad \text{where } S = 4\pi^2 \text{ is the area of the torus.}$$


Here the lower limit is obtained by a rather crude estimate. However, a
direction with curvature $K = -1/(2S)$ certainly exists, and there are many
other directions with curvature of approximately the same size. In order to
make a rough estimate of the characteristic path length, we make the rough
guess $K_0 = -1/(2S)$ as value of the “mean curvature.”

If we agree to start from this value $K_0$ of the curvature, we obtain the
characteristic path length

$$s = \frac{1}{\sqrt{-K_0}} = \sqrt{2S}.$$

The velocity of motion with respect to the group which corresponds to
our tradewind current is equal to $\sqrt{S/2}$ (since the average square value of
the sine is 1/2). Therefore, the time it takes for our flow to travel the characteristic
path length is equal to 2. The fastest particles of the fluid go a distance of 2 after
this time, i.e., $1/\pi$ of the entire orbit around the torus.

Thus, if we take our value of the mean curvature, then the error grows by
$e^{\pi} \approx 20$ after the time of one orbit of the fastest particle. Taking the value
100 km/hr as the maximal velocity of the tradewind current, we get 400 hours
for the time of orbit, i.e., less than three weeks.

Thus, if at the initial moment the state of the weather was known with
small error $\varepsilon$, then the order of magnitude of the error of prediction after n
months would be

$$10^{kn}\varepsilon, \quad \text{where } k = \frac{30 \cdot 24}{400}\,\pi \log_{10} e \approx 2.5.$$

For example, to predict the weather two months in advance we must have
initial data with five more digits of accuracy than the prediction accuracy.
Practically, this means that calculating the weather for such a period is
impossible.

It is clear that the estimates mentioned here are not very sharp, and the
model we took is very simplified. The choice of the value of “mean curvature”
also requires justification.
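The arithmetic of this estimate is easy to reproduce. The sketch below follows the steps of the text; the 40,000 km circumference is an assumption (it is what the text's 400 hours at 100 km/hr implies), and the variable names are illustrative.

```python
import math

S = 4 * math.pi ** 2                  # area of the torus {(x, y) mod 2*pi}
K0 = -1 / (2 * S)                     # the text's guess for the "mean curvature"
path_length = 1 / math.sqrt(-K0)      # characteristic path length, sqrt(2S)
speed = math.sqrt(S / 2)              # norm of v = (sin y, 0): mean of sin^2 is 1/2
e_fold_time = path_length / speed     # time in which errors grow e times

orbit_time = 2 * math.pi              # fastest particle moves at unit speed
growth_per_orbit = math.exp(orbit_time / e_fold_time)   # e**pi

# Physical scaling. The circumference is an assumption, not stated in the text.
circumference_km = 40_000
max_speed_kmh = 100
orbit_hours = circumference_km / max_speed_kmh

hours_per_month = 30 * 24
k = hours_per_month / orbit_hours * math.pi * math.log10(math.e)
digits_lost_in_two_months = 2 * k
```

Here `growth_per_orbit` is e^π, a little over 23, which the text rounds to 20, and `k` comes out just under 2.5, so two months cost about five decimal digits.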
The symplectic manifolds of classical mechanics are most often phase spaces
of lagrangian mechanical systems, i.e., cotangent bundles of configuration 
spaces. 

An entirely different series of symplectic manifolds arises in algebraic 
geometry. 

For example, any smooth complex algebraic manifold (given by a system 
of polynomial equations in complex projective space) has a natural symplectic 
structure. 

The construction of a symplectic structure on an algebraic manifold is 
based on the fact that complex projective space itself has a particular sym- 
plectic structure, namely the imaginary part of its hermitian structure. 


A The hermitian structure of complex 
projective space 


Recall that n-dimensional complex projective space $\mathbb{CP}^n$ is the manifold of all
complex lines passing through the point 0 in an (n + 1)-dimensional
complex vector space $\mathbb{C}^{n+1}$. To construct a symplectic structure on $\mathbb{CP}^n$ we use
the hermitian structure in the corresponding vector space $\mathbb{C}^{n+1}$.


Recall that a hermitian scalar product (or hermitian structure) on a complex vector space
is a complex-valued function on pairs of vectors, which (1) is linear in the first and anti-linear
in the second variable, (2) changes its value to the complex conjugate when the arguments are
interchanged, and (3) becomes a positive-definite real quadratic form if we take the arguments
equal:

$$\langle \lambda\xi, \eta \rangle = \lambda \langle \xi, \eta \rangle, \quad \langle \xi, \eta \rangle = \overline{\langle \eta, \xi \rangle}, \quad \langle \xi, \xi \rangle > 0 \ \text{ for } \xi \ne 0.$$

An example of a hermitian scalar product is

$$\langle \xi, \eta \rangle = \sum_k \xi_k \bar{\eta}_k, \tag{1}$$

where $\xi_k$ and $\eta_k$ are the coordinates of the vectors $\xi$ and $\eta$ in some basis.

A basis for which a hermitian scalar product has the form (1) always exists, and is called a
hermitian-orthonormal basis.

The real and imaginary parts of a hermitian scalar product are real bilinear forms. The
first is symmetric, and the second skew-symmetric, and both are nondegenerate:

$$\langle \xi, \eta \rangle = (\xi, \eta) + i[\xi, \eta], \quad (\xi, \eta) = (\eta, \xi), \quad [\xi, \eta] = -[\eta, \xi].$$

The quadratic form $(\xi, \xi)$ is positive-definite.

Thus a hermitian structure < , > on a complex vector space gives it a euclidean structure
( , ) and a symplectic structure [ , ]. These two structures are related to the complex structure
by the relation

$$[\xi, \eta] = (\xi, i\eta).$$
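These relations can be checked on a concrete pair of vectors. The sketch below uses the standard hermitian product (1) in a hermitian-orthonormal basis; the function names and the sample vectors are illustrative choices.

```python
def hermitian(xi, eta):
    """Hermitian product (1): linear in xi, anti-linear in eta."""
    return sum(a * b.conjugate() for a, b in zip(xi, eta))


def euclid(xi, eta):
    """Euclidean structure ( , ): the real part of the hermitian product."""
    return hermitian(xi, eta).real


def sympl(xi, eta):
    """Symplectic structure [ , ]: the imaginary part of the hermitian product."""
    return hermitian(xi, eta).imag


xi = [1 + 2j, -0.5j, 3 + 0j]
eta = [2 - 1j, 1 + 1j, -1j]
i_eta = [1j * c for c in eta]   # the vector i * eta
```

Multiplying the second argument by i multiplies the hermitian product by -i, whose real part is the original imaginary part; this is why `sympl(xi, eta)` and `euclid(xi, i_eta)` agree.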


We will now define a riemannian metric on complex projective space. 
To do this, consider the unit sphere 
$$S^{2n+1} = \{ z \in \mathbb{C}^{n+1} : \langle z, z \rangle = 1 \}$$


in the corresponding vector space $\mathbb{C}^{n+1}$. This sphere inherits the riemannian
metric from $\mathbb{C}^{n+1}$. Every complex line intersects our sphere in a great circle.
Definition. The distance between two points of complex projective space is
the distance between the two corresponding circles on the unit sphere.


We note that these two circles are parallel in the sense that the distance
from any point of one of the circles to the other is the same (Proof:
multiplication of z by $e^{i\varphi}$ preserves the metric on the sphere). This circumstance allows
us at once to write down an explicit formula (2) for the riemannian metric on
the complex projective space given by the construction defined above.

In fact, let p denote the mapping

$$p: \mathbb{C}^{n+1} \setminus 0 \to \mathbb{CP}^n,$$

taking a point $z \ne 0$ of the vector space $\mathbb{C}^{n+1}$ to the complex line passing
through 0 and z.

Every vector $\zeta$ tangent to $\mathbb{CP}^n$ at the point pz can be represented (in many
ways) as the image of a vector $\xi$ at the point z under this map:

$$\zeta = p_*\xi, \quad \xi \in T\mathbb{C}^{n+1}_z.$$


Theorem. The square of the length of a vector $\zeta$ in the riemannian metric
defined above is given by the formula

$$ds^2(\zeta) = \frac{\langle \xi, \xi \rangle \langle z, z \rangle - \langle \xi, z \rangle \langle z, \xi \rangle}{\langle z, z \rangle^2}. \tag{2}$$


PROOF. Assume first that the point z lies on the unit sphere $S^{2n+1}$.

Decompose the vector $\xi$ into two components: one in the complex line determined by the
vector z and the other in the hermitian-orthogonal direction. Note that hermitian-orthogonal
to the vector z means euclidean-orthogonal to the vectors z and iz. The vector z is a euclidean
normal vector to the sphere $S^{2n+1}$ at z. The vector iz is a vector tangent to the circle in which
the sphere intersects the complex line passing through z. Thus the component $\eta$ of the vector $\xi$
which is hermitian-orthogonal to z is tangent to the sphere $S^{2n+1}$ and euclidean-orthogonal
to the circle in which the sphere intersects the line pz.

By the definition of the metric on $\mathbb{CP}^n$, the riemannian square of the length of the vector $\zeta$
is equal to the euclidean square length of the component $\eta$ of $\xi$ which is hermitian-orthogonal
to z.

We calculate the component of $\xi$ hermitian-orthogonal to z. We write our decomposition as

$$\xi = cz + \eta, \quad \text{where } \langle \eta, z \rangle = 0.$$

By hermitian multiplication with z, we find

$$\langle \xi, z \rangle = c \langle z, z \rangle,$$

so

$$\eta = \frac{\langle z, z \rangle \xi - \langle \xi, z \rangle z}{\langle z, z \rangle}.$$

Calculating the hermitian square of the vector $\eta$, we find $\langle \eta, \eta \rangle = \langle \eta, \xi \rangle$ and

$$\langle \eta, \xi \rangle = \frac{\langle z, z \rangle \langle \xi, \xi \rangle - \langle \xi, z \rangle \langle z, \xi \rangle}{\langle z, z \rangle}.$$

Thus, formula (2) is proved for points z of the unit sphere. The general case follows from looking
at the homothetic transformation $z \to z/|z|$. □
Note that our construction allows us to define not only a euclidean
structure (2), but also a hermitian structure on the tangent space to $\mathbb{CP}^n$.
Consider the hermitian-orthogonal complement H to the direction of the
vector z in the space $T\mathbb{C}^{n+1}_z$, where $z \in S^{2n+1}$. The map $p_*: H \to T(\mathbb{CP}^n)_{pz}$
maps H isomorphically (as we showed above) onto the tangent space to $\mathbb{CP}^n$
and carries over the hermitian structure from H.

It is clear that the scalar square defined by this hermitian structure is given
by formula (2). Therefore, the formula for the hermitian scalar product in
the tangent space to $\mathbb{CP}^n$ can be written down without further calculations:

$$\langle \zeta_1, \zeta_2 \rangle = \frac{\langle \xi_1, \xi_2 \rangle \langle z, z \rangle - \langle \xi_1, z \rangle \langle z, \xi_2 \rangle}{\langle z, z \rangle^2} \tag{3}$$

for any vectors $\xi_1$, $\xi_2$ in $T\mathbb{C}^{n+1}_z$ satisfying the relation $p_*\xi_k = \zeta_k \in T(\mathbb{CP}^n)_{pz}$.
We note that in formula (3) the point z does not necessarily lie on the unit
sphere.

The euclidean and hermitian structures (2) and (3) constructed on the
tangent spaces to $\mathbb{CP}^n$ are not invariant under all projective transformations
of the manifold $\mathbb{CP}^n$, but are invariant under those which are given by unitary
(preserving the hermitian structure) linear transformations of the vector
space $\mathbb{C}^{n+1}$.


B The symplectic structure of complex 
projective space 


We consider the imaginary part of the hermitian form (3), taken with
coefficient $-1/\pi$ (the reason for taking this coefficient is explained in Problem 1,
Section C):

$$\Omega(\zeta_1, \zeta_2) = -\frac{1}{\pi} \operatorname{Im} \langle \zeta_1, \zeta_2 \rangle.$$

Like the imaginary part of any hermitian form, the real bilinear form $\Omega$ on
the tangent space to complex projective space is skew-symmetric and
nondegenerate.


Theorem. The differential 2-form $\Omega$ gives a symplectic structure on complex
projective space.


PROOF. We need only verify that the form $\Omega$ is closed.

Consider the exterior derivative $d\Omega$ of the form $\Omega$. This differential 3-form on $\mathbb{CP}^n$ is invariant
with respect to mappings induced by unitary transformations of the space $\mathbb{C}^{n+1}$. It follows from
this that it is equal to zero.

To see this, we look at a hermitian-orthonormal basis $e_1, \ldots, e_n$ of the tangent space to
$\mathbb{CP}^n$ at some point z. Then the vectors $e_1, \ldots, e_n, ie_1, \ldots, ie_n$ form a euclidean-orthonormal
R-basis. We will show that the value of the form $d\Omega$ on any triple of these R-basis vectors is
equal to zero. (We assume that n > 1; for n = 1 there is nothing to prove.)

Note that in any triple of R-basis vectors at least one is hermitian-orthogonal to the two
others. Denote this vector by e. It is easy to construct a unitary transformation of the space $\mathbb{C}^{n+1}$
inducing a motion on $\mathbb{CP}^n$ which fixes the point z and the hermitian-orthogonal complement
to e, and changes the direction of e.

The value of the form $d\Omega$ on our three vectors e, f, and g is equal to its value on the triple
-e, f, and g by the invariance of the form $\Omega$, and is hence equal to zero. □


Remark. Another method of constructing the same symplectic structure
on complex projective space consists of the following. Consider small
oscillations of a mathematical pendulum with an (n + 1)-dimensional
configuration space. We make use of the integral of energy to decrease by 1 the degree of
freedom of the system. The phase space obtained after this operation is $\mathbb{CP}^n$,
and the symplectic structure on it agrees with the form $\Omega$ described above up
to a factor.


One other method of constructing a symplectic structure on $\mathbb{CP}^n$ uses the fact that this
space may be represented as one of the orbits of the co-adjoint representation of a Lie group,
and on every such orbit there is always a standard symplectic structure (cf. Appendix 2,
Section A). For the Lie group we can take the group of unitary (preserving the hermitian metric)
operators in an (n + 1)-dimensional complex space. The orbits of the co-adjoint representation
in this case are the same as of the adjoint representation. In the adjoint representation the operator
of reflection through a hyperplane (which changes the sign of the first coordinate and leaves
the others fixed) has $\mathbb{CP}^n$ as its orbit, since the reflection operator is uniquely determined by
the complex line orthogonal to the hyperplane.


C Symplectic structure on algebraic manifolds 


We will now obtain a symplectic structure on any complex submanifold M
of complex projective space. Let $j: M \to \mathbb{CP}^n$ be an embedding of the complex
manifold M into complex projective space. The riemannian, hermitian, and
symplectic structures on projective space induce corresponding structures on
M. For example, the symplectic structure on M is given by the formula

$$\Omega_M = j^*\Omega.$$


Theorem. The differential form $\Omega_M$ gives a symplectic structure on the manifold
M.


PROOF. The nondegeneracy of the 2-form $\Omega_M$ follows from the fact that M
is a complex submanifold. In fact, the quadratic form

$$(\xi, \xi) = \Omega_M(\xi, i\xi)$$

is positive-definite (it is induced by the riemannian metric on $\mathbb{CP}^n$). Therefore,
the bilinear form $(\xi, \eta) = \Omega_M(\xi, i\eta)$ is nondegenerate. This means that the
form $\Omega_M$ is also nondegenerate. The form $\Omega_M$ is closed since the form $\Omega$ is
closed. □


Remark. In the same way as for complex projective space, we define a
hermitian structure on the tangent spaces of its complex submanifolds; the
symplectic structure is the imaginary part.
A complex manifold with a hermitian metric whose imaginary part is a
closed form (i.e., a symplectic structure) is called a Kähler manifold and its
hermitian metric a Kähler metric. Many important results have been
obtained in the geometry of Kähler manifolds; in particular, they have
remarkable topological properties (cf., for example, A. Weil, “Variétés
Kählériennes,” Hermann, 1958).

Not all symplectic manifolds admit a Kähler structure.


PROBLEM 1. Calculate the symplectic structure $\Omega$ in the affine chart $w = z_1 : z_0$ of the projective
line $\mathbb{CP}^1$.


ANSWER. $\Omega = (1/\pi)(dx \wedge dy)/(1 + x^2 + y^2)^2$, where w = x + iy. The coefficient in the
definition of the form $\Omega$ is chosen to obtain the usual orientation of the complex line ($dx \wedge dy$)
and so that the integral of the form $\Omega$ along the whole projective line is equal to 1.
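The answer can be confirmed numerically from formula (3) and the definition of Ω. The sketch below assumes the lift z = (1, w) of the chart point w and the lifts (0, 1) and (0, i) of the coordinate vectors ∂/∂x and ∂/∂y; these choices, the function names, and the quadrature parameters are illustrative.

```python
import math


def herm(u, v):
    """Standard hermitian product: linear in u, anti-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def fs_hermitian(z, xi1, xi2):
    """Formula (3): hermitian product of p_* xi1 and p_* xi2 at the point pz."""
    zz = herm(z, z).real
    return (herm(xi1, xi2) * zz - herm(xi1, z) * herm(z, xi2)) / zz ** 2


def omega(z, xi1, xi2):
    """Omega = -(1/pi) * Im of the hermitian form (3)."""
    return -fs_hermitian(z, xi1, xi2).imag / math.pi


def omega_in_chart(w):
    """Omega(d/dx, d/dy) at w = x + iy, using the lift z = (1, w)."""
    z = [1 + 0j, complex(w)]
    d_dx = [0j, 1 + 0j]
    d_dy = [0j, 1j]
    return omega(z, d_dx, d_dy)


def expected_density(w):
    """Coefficient of dx ^ dy in the stated answer."""
    return 1 / (math.pi * (1 + abs(w) ** 2) ** 2)


def area_within(radius, steps=100_000):
    """Midpoint-rule integral of Omega over the disc |w| < radius."""
    h = radius / steps
    return sum(2 * math.pi * r * omega_in_chart(r) * h
               for r in ((j + 0.5) * h for j in range(steps)))
```

The disc |w| < R carries area 1 − 1/(1 + R²), which tends to 1 as R grows; this is the normalization mentioned in the answer.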


PROBLEM 2. Show that the symplectic structure $\Omega$ in the affine chart $w_k = z_k z_0^{-1}$ (k = 1, ..., n)
is given by the formula

$$\Omega = \frac{i}{2\pi} \, \frac{\sum_{0 \le k < l \le n} (w_k\,dw_l - w_l\,dw_k) \wedge (\bar{w}_k\,d\bar{w}_l - \bar{w}_l\,d\bar{w}_k)}{\left( \sum_{m=0}^{n} w_m \bar{w}_m \right)^2}.$$

By convention, $w_0 = 1$.

Remark. Differential forms on a complex space with complex values (such as $dw_k$ and $d\bar{w}_k$)
are defined as complex linear functions of tangent vectors; if $w_k = x_k + iy_k$, then

$$dw_k = dx_k + i\,dy_k, \quad d\bar{w}_k = dx_k - i\,dy_k.$$

The space of such forms in $\mathbb{C}^n$ has complex dimension 2n; the 2n forms $dw_k$, $d\bar{w}_k$ (k = 1, ..., n),
for example, form a C-basis, or the 2n forms $dx_k$, $dy_k$.

Exterior multiplication is defined in the usual way and obeys the usual rules. For example,

$$dw \wedge d\bar{w} = (dx + i\,dy) \wedge (dx - i\,dy) = -2i\,dx \wedge dy.$$

Let f be a real-smooth function on $\mathbb{C}^n$ (with complex values, in general). An example of
such a function is $|w|^2 = \sum w_k \bar{w}_k$. The differential of the function f is a complex 1-form.
Therefore, it can be decomposed in the basis $dw_k$, $d\bar{w}_k$. The coefficients of this decomposition are
called the partial derivatives “with respect to $w_k$” and “with respect to $\bar{w}_k$”:

$$df = \sum_k \frac{\partial f}{\partial w_k}\,dw_k + \sum_k \frac{\partial f}{\partial \bar{w}_k}\,d\bar{w}_k.$$

In calculating exterior derivatives it is also convenient to separate into differentiation d'
with respect to the variable w and d'' with respect to the variable $\bar{w}$, so that d = d' + d''.
For example, for a function f

$$d'f = \sum_k \frac{\partial f}{\partial w_k}\,dw_k, \quad d''f = \sum_k \frac{\partial f}{\partial \bar{w}_k}\,d\bar{w}_k.$$

For the differential 1-form

$$\omega = \sum_k a_k\,dw_k + b_k\,d\bar{w}_k,$$

the operators d' and d'' are defined analogously:

$$d'\omega = \sum_k d'a_k \wedge dw_k + d'b_k \wedge d\bar{w}_k,$$
$$d''\omega = \sum_k d''a_k \wedge dw_k + d''b_k \wedge d\bar{w}_k.$$

PROBLEM 3. Show that the symplectic structure $\Omega$ on the affine chart ($w_k = z_k z_0^{-1}$) of the projective
space $\mathbb{CP}^n$ is given by the formula

$$\Omega = \frac{i}{2\pi}\, d'd'' \ln \sum_{k=0}^{n} |w_k|^2.$$
An odd-dimensional manifold cannot admit a symplectic structure. The
analogue of a symplectic structure for odd-dimensional manifolds is a little 
less symmetric, but also a very interesting structure—the contact structure. 

The source of symplectic structures in mechanics are phase spaces (i.e., 
cotangent bundles to configuration manifolds), on which there is always a 
canonical symplectic structure. The source of contact structures are mani- 
folds of contact elements of configuration spaces. 

A contact element to an n-dimensional smooth manifold at some point is 
an (n — 1)-dimensional plane tangent to the manifold at that point (i.e., an 
(n — 1)-dimensional subspace of the n-dimensional tangent space at that 
point). 

The set of all contact elements of an n-dimensional manifold has a natural 
smooth manifold structure of dimension 2n — 1. It turns out that there is an 
interesting additional “contact structure” on this odd-dimensional manifold 
(we describe this below). 

The manifold of contact elements of a riemannian n-dimensional manifold 
is closely related to the (2n — 1)-dimensional manifold of unit tangent vectors 
of this riemannian n-dimensional manifold, or to the (2n — 1)-dimensional 
energy level manifold of a point mass moving on the riemannian manifold 
under inertia. The contact structures on these (2n — 1)-dimensional mani- 
folds are closely related to the symplectic structure on the 2n-dimensional 
phase space of the point (i.e., the cotangent bundle of the original n-dimen- 
sional riemannian manifold). 


A Definition of contact structure 


Definition. A contact structure on a manifold is a smooth field of tangent
hyperplanes⁹⁸ satisfying a nondegeneracy condition which will be
formulated later.


To formulate this condition we examine what a field of hyperplanes looks 
like in general in a neighborhood of a point in an N-dimensional manifold. 


EXAMPLE. Let N = 2. Then the manifold is a surface and a field of hyper- 
planes is a field of straight lines. Such a field in a neighborhood of a point is 
always constructed very simply, namely, as a field of tangents to a family 
of parallel lines in a plane. More precisely, one of the basic results of the local 
theory of ordinary differential equations is that it is possible to change any 
smooth field of tangent lines on a manifold into a field of tangents to a family 
of straight lines in euclidean space by using a diffeomorphism in a sufficiently 
small neighborhood of any point of the manifold. 

If N > 2, then a hyperplane is not a line, and the question becomes
significantly more complicated. For example, most fields of two-dimensional
tangent planes in ordinary three-dimensional space cannot be
diffeomorphically mapped onto a field of parallel planes. The reason is that there
exist fields of tangent planes for which it is impossible to find “integral
surfaces,” i.e., surfaces which have the prescribed tangent plane at each point.


⁹⁸ A hyperplane in a vector space is a subspace of dimension 1 less than the dimension of the
space (i.e., the zero level set of a linear function which is not identically zero). A tangent
hyperplane is a hyperplane in a tangent space.


The nondegeneracy condition for a field of hyperplanes which enters into 
the definition of contact structure consists of the stipulation that the field of 
hyperplanes must be maximally far from a field of tangents to a family of 
hyperplanes. In order to measure this distance, as well as to convince our- 
selves of the existence of fields without integral hypersurfaces, we must make 
a few constructions and calculations.⁹⁹


B Frobenius’ integrability condition 


We will consider some point on an N-dimensional manifold and try to 
construct a surface passing through this point and tangent to a given field 
of (N — 1)-dimensional planes at each point (an integral surface). 

To this end we introduce a coordinate system onto a neighborhood of 
this point so that at the point itself one coordinate surface is tangent to a 
plane of the field. We will call this plane the horizontal plane, and will call 
the coordinate axis not lying in it the vertical axis. 

Construction of an integral surface. An integral surface, if one exists, is the 
graph of a function of N — 1 variables near the origin. To construct it, we 
can take some smooth path on the horizontal plane. Then the vertical lines 
over this path form a two-dimensional surface (cylinder); our field of planes 
intersects its tangent planes in a field of tangent lines. The integral surface 
we are looking for, if it exists, intersects this cylinder in an integral curve of the 
field of lines, starting at the origin. Such an integral curve always exists 
independent of whether an integral surface exists. Thus we can construct an 
integral surface over the horizontal plane by moving along smooth curves in 
the latter. 

In order to obtain a smooth integral surface from all the integral curves 
we need the result of our construction to be independent of the path, deter- 
mined only by its endpoint. In particular, for a circuit of a closed path in a 
neighborhood of the origin in the horizontal plane, the integral curve on the 
cylinder must close up. 

It is easy to construct examples of fields of planes for which such closure 
does not take place and, therefore, for which an integral surface does not 
exist. Such fields of planes are called nonintegrable. 

Example of a nonintegrable field of planes. In order to give a field of planes
and measure numerically the deviation from closure, we introduce the following
notation. We note first of all that a field of hyperplanes can be given locally
by a differential 1-form; a plane in the tangent space gives a 1-form up to
multiplication by a nonzero constant. We will choose this constant so that
the value of the form on the vertical basis vector is equal to 1.


⁹⁹ From now on, we will omit the prefix “hyper-”. If we wish, we may assume that we are in
three-dimensional space and a hypersurface is an ordinary surface. The higher-dimensional
case is analogous to the three-dimensional case.

This condition can be satisfied in some neighborhood of the origin since 
the plane of the field at zero does not contain the vertical direction. This 
condition determines the form uniquely (given the field of planes). 

A field of planes in ordinary three-space which does not have an integral 
surface can be given, for example, by the 1-form 


$$\omega = x\,dy + dz,$$


where x and y are the horizontal coordinates and z is the vertical. The proof 
of the fact that this field of planes is nonintegrable will be given below. 

Construction of a 2-form measuring nonintegrability. With the help of the 
form giving the field, we can measure the degree of nonintegrability. This is 
done using the following construction (Figure 236). 


Figure 236 Integral curves constructed for a non-integrable field of planes 


Consider a pair of vectors emanating from the origin and lying in the 
horizontal plane of our coordinate system. Construct a parallelogram on 
them. We obtain two paths from the origin to the opposite vertex. Over each 
of these two paths we can construct an integral curve (with two sections) as 
described above. As a result, in general, there arise two different points over 
the vertex of the parallelogram opposite to the origin. The difference in the 
heights of these points is a function of our pair of vectors. This function is 
skew-symmetric and equal to zero if one of the vectors is equal to zero. Thus 
the linear part of the Taylor series of this function is zero at zero, and the 
quadratic part of its Taylor series is a bilinear skew-symmetric form on the 
horizontal plane. 

If the field is integrable, then this 2-form is equal to zero. Therefore, this 
2-form can be considered as a measure of the nonintegrability of the field. 

The 2-form is well defined. We constructed the 2-form above with the help 
of coordinates. However, the value of our 2-form on a pair of tangent vectors 
does not depend on the coordinate system, but only on the 1-form used to 
give the field. 

To convince ourselves of this, it is enough to prove the following. 


Theorem. The 2-form defined above agrees with the exterior derivative of the
1-form $\omega$, $d\omega|_{\omega=0}$, on the null space of $\omega$.

PROOF. We will show that the difference in the heights of the two points obtained as a result
of our two motions along the sides of the parallelogram is the same as the integral of the 1-form $\omega$
over the four sides of the parallelogram, up to a quantity small of third order with respect to
the sides of the parallelogram.

To this end we note that the height of the rise of an integral curve along any path of length $\varepsilon$
emanating from the origin has order $\varepsilon^2$, since at the origin the plane of the field is horizontal.
Therefore, the integrals of the 2-form $d\omega$ over all four vertical areas over the sides of the
parallelogram bounded by the integral curves and the horizontal plane, have order $\varepsilon^3$ if the sides
are of order $\varepsilon$.

The integrals of the form $\omega$ along integral curves are exactly equal to zero. Therefore, by
Stokes’ formula, the increase in height along the integral curve lying over any of the sides of the
parallelogram is equal to the integral of the 1-form $\omega$ along this side up to a quantity of
third-order smallness.

Now the theorem follows directly from the definition of exterior differentiation. 
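
As a worked illustration (a sketch added for concreteness, not part of the
original argument), the height difference can be computed explicitly for the
example form x dy + dz given above. The names a = (a_1, a_2) and b = (b_1, b_2)
for the two horizontal vectors, and the choice of which path is subtracted
from the other, are assumptions of this sketch. Along an integral curve we
have dz = -x dy, so a straight move from a point with first coordinate x_0
along a vector v = (v_1, v_2) changes the height as follows:

```latex
\begin{align*}
\Delta z(x_0; v) &= -\int_0^1 (x_0 + t v_1)\, v_2 \, dt
                  = -x_0 v_2 - \tfrac12 v_1 v_2, \\
z_{a,b} &= \Delta z(0; a) + \Delta z(a_1; b)
         = -\tfrac12 a_1 a_2 - a_1 b_2 - \tfrac12 b_1 b_2, \\
z_{b,a} &= \Delta z(0; b) + \Delta z(b_1; a)
         = -\tfrac12 b_1 b_2 - b_1 a_2 - \tfrac12 a_1 a_2, \\
z_{b,a} - z_{a,b} &= a_1 b_2 - a_2 b_1
         = (dx \wedge dy)(a, b) .
\end{align*}
```

In this particular example the gap is exactly bilinear, with no third-order
remainder, and it coincides with the exterior derivative dx ∧ dy of the form
evaluated on the pair (a, b). It vanishes only when a and b are parallel.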


Some arbitrariness remains in the choice of the 1-form which we used to
construct our 2-form. Namely, the form $\omega$ is defined by the field of planes
only up to multiplication by a function $f$ which is never zero. In other words,
we could have started with the form $f\omega$. Then we would have obtained the
2-form


$$d(f\omega) = f\,d\omega + df \wedge \omega,$$


which, on our plane, differs from the 2-form $d\omega$ by multiplication by the
nonzero number $f(0)$.

Thus the 2-form constructed on the plane of the field is defined invariantly 
up to multiplication by a nonzero constant. 
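
The reason the second summand drops out can be spelled out in one line. The
following is a sketch under the assumption that both vectors, called ξ and η
here, lie in the plane of the field, so that the 1-form vanishes on each of
them:

```latex
\begin{align*}
d(f\omega)(\xi, \eta)
  &= f\, d\omega(\xi, \eta) + df(\xi)\,\omega(\eta) - df(\eta)\,\omega(\xi) \\
  &= f\, d\omega(\xi, \eta)
  \qquad \text{when } \omega(\xi) = \omega(\eta) = 0 .
\end{align*}
```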


Condition for integrability of a field of planes 


Theorem. If a field of hyperplanes is integrable, then the 2-form constructed 
above on a plane of the field is equal to zero. Conversely, if the 2-form con- 
structed on every plane of the field is equal to zero, then the field is integrable. 


PROOF. The first assertion of the theorem is clear by the construction of the 2-form. The proof
of the second assertion can be carried out by exactly the same reasoning we used to prove the 
commutativity of phase flows for which the Poisson bracket of the velocity fields was equal to 
zero. We can simply refer to this commutativity, applying it to the integral curves arising over 
the lines of the coordinate directions in the horizontal plane. 


Theorem. The integrability condition for a field of planes,

$$d\omega = 0 \quad \text{for} \quad \omega = 0,$$

is equivalent to the following condition of Frobenius:

$$\omega \wedge d\omega = 0.$$

PROOF. We consider the value of the 3-form above on any three distinct coordinate vectors.
Only one of these vectors can be the vertical. Therefore, of all the terms entering into the
definition of the value of the exterior product on the three vectors, only one is nonzero: the product of
the value of the form $\omega$ on the vertical vector with the value of the form $d\omega$ on the pair of
horizontal vectors. If the field given by the form is integrable, then the second factor is zero,
so our 3-form is zero on arbitrary triples of vectors.

Conversely, if the 3-form is equal to zero for any vectors, then it is equal to zero for any
triple of coordinate vectors, of which one is vertical and the other two horizontal. The value
of the 3-form on such a triple is equal to the product of the value of $\omega$ on the vertical vector
with the value of $d\omega$ on the pair of horizontal vectors. The first factor is not zero, so the second
must be zero, and thus the form $d\omega$ is zero on a plane of the field. □
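
The Frobenius condition makes it easy to check the example x dy + dz given
earlier. The computation below is a sketch added for illustration. The second
form, written ω' here, is a contrasting example chosen for this sketch and does
not appear in the text:

```latex
\begin{align*}
\omega &= x\,dy + dz, &
d\omega &= dx \wedge dy, &
\omega \wedge d\omega &= dz \wedge dx \wedge dy = dx \wedge dy \wedge dz \neq 0, \\
\omega' &= y\,dx + x\,dy + dz = d(xy + z), &
d\omega' &= 0, &
\omega' \wedge d\omega' &= 0 .
\end{align*}
```

So the first field is nonintegrable at every point, while the second has the
integral surfaces xy + z = const.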


C Nondegenerate fields of hyperplanes 


Definition. A field of hyperplanes is said to be nondegenerate at a point if the
rank of the 2-form $d\omega|_{\omega=0}$ in the plane of the field passing through this
point is equal to the dimension of the plane.


This means that for any nonzero vector in our plane, we can find another 
vector in the plane such that the value of the 2-form on this pair of vectors 
is not zero. 


Definition. A field of planes is called nondegenerate on a manifold if it is non- 
degenerate at every point of the manifold. 


Note that on an even-dimensional manifold there cannot be a nondegen- 
erate field of hyperplanes; on such a manifold a hyperplane is odd-dimen- 
sional, and the rank of every skew-symmetric bilinear form on an 
odd-dimensional space is less than the dimension of the space (cf. Section 44). 
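
The rank statement can be verified with a standard determinant argument,
sketched here for convenience. The matrix A of the skew-symmetric form and the
odd dimension k are notation introduced for this sketch:

```latex
\begin{align*}
A^{T} = -A, \quad k \text{ odd}
\;\Longrightarrow\;
\det A = \det A^{T} = \det(-A) = (-1)^{k} \det A = -\det A
\;\Longrightarrow\;
\det A = 0 .
\end{align*}
```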

Nondegenerate fields of hyperplanes do exist on odd-dimensional mani- 
folds. 


EXAMPLE. Consider a euclidean space of dimension 2m + 1 with coordinates 
x, y, and z (where x and y are vectors in an m-dimensional space and z is a 
number). The 1-form 


$$\omega = x\,dy + dz$$


defines a field of hyperplanes. The plane of the field passing through the origin 
has equation dz = 0. We take x and y as coordinates in this hyperplane. 
Therefore, in this plane of the field our 2-form can be written in the form 


$$d\omega|_{\omega=0} = dx \wedge dy = dx_1 \wedge dy_1 + \cdots + dx_m \wedge dy_m.$$


The rank of this form is 2m, so our field is nondegenerate at the origin, and 
thus also in a neighborhood of the origin (in fact, this field of planes is 
nondegenerate at all points of the space). 
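
The parenthetical claim about all points of the space can be checked directly.
The following sketch uses a basis of the plane of the field at an arbitrary
point (x, y, z); the vectors e_i and f_i are names introduced here for
illustration:

```latex
\begin{align*}
e_i &= \frac{\partial}{\partial x_i}, &
f_i &= \frac{\partial}{\partial y_i} - x_i \frac{\partial}{\partial z}, &
\omega(e_i) &= 0, \quad \omega(f_i) = x_i - x_i = 0, \\
d\omega(e_i, f_j) &= \delta_{ij}, &
d\omega(e_i, e_j) &= 0, &
d\omega(f_i, f_j) &= 0 .
\end{align*}
```

In this basis the matrix of the 2-form is the standard skew-symmetric matrix of
rank 2m at every point, not only at the origin.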


Now, finally, we can give the definition of a contact structure on a manifold:
a contact structure on a manifold is a nondegenerate field of tangent
hyperplanes.


D The manifold of contact elements


The term “contact structure” stems from the fact that there is always such a 
structure on a manifold of contact elements of a smooth n-manifold. 


Definition. A hyperplane (dimension n — 1) tangent to a manifold at some 
point is called a contact element, and this point the point of contact. 


The set of all contact elements of an n-dimensional manifold has the struc- 
ture of a smooth manifold of dimension 2n — 1. 


In fact, the set of contact elements with a fixed point of contact is the set of all (n — 1)-dimen- 
sional subspaces of an n-dimensional vector space, i.e., a projective space of dimension n — 1. 
To give a contact element we must therefore give the n coordinates of the point of contact 
together with the n — 1 coordinates defining a point of an (n — 1)-dimensional projective 
space: 2n - 1 coordinates in all.


The manifold of all contact elements of an n-dimensional manifold is a 
fiber bundle whose base is our manifold and whose fiber is (n — 1)-dimen- 
sional projective space. 


Theorem. The bundle of contact elements is the projectivization of the cotangent 
bundle: it can be obtained from the cotangent bundle by changing every 
cotangent n-dimensional vector space into an (n — 1)-dimensional pro- 
jective space (a point of which is a line passing through the origin in the 
cotangent space). 


PROOF. A contact element is given by a 1-form on the tangent space, for which this element is
a zero level set. This form is not zero, and it is determined up to multiplication by a nonzero
number. But a form on the tangent space is a vector of the cotangent space. Therefore, a
nonzero form on the tangent space, determined up to a multiplication by a nonzero number,
is a nonzero vector of the cotangent space, determined up to a multiplication by a nonzero
number, i.e., a point of the projectivized cotangent space. □


The contact structure on the manifold of contact elements. In the tangent 
space to the manifold of contact elements there is a distinguished hyperplane. 
It is called the contact hyperplane and is defined in the following way. 

We fix a point of the (2n — 1)-dimensional manifold of contact elements 
on an n-dimensional manifold. We can think of this point as an (n — 1)- 
dimensional plane tangent to the original n-dimensional manifold. 


Definition. A tangent vector to the manifold of contact elements at a fixed
point belongs to the contact hyperplane if its projection onto the
n-dimensional manifold lies in the (n - 1)-dimensional plane which is the
given point of the manifold of contact elements.


In other words, a displacement of a contact element is tangent to the
contact hyperplane if the velocity of the point of contact belongs to this 
contact element, no matter how the element turns. 
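
For the plane (n = 2) this definition can be written out in coordinates. The
following is a sketch under assumptions made here: we work only in the chart
of non-vertical contact elements, with coordinates (x, y) for the point of
contact and p for the slope of the line, and g denotes an arbitrary smooth
function used as an example.

```latex
\begin{align*}
\text{contact element at } (x, y) \text{ with slope } p :&\quad
  \{ (\dot x, \dot y) : \dot y = p\,\dot x \}, \\
\text{contact hyperplane at } (x, y, p) :&\quad
  \{ (\dot x, \dot y, \dot p) : \dot y - p\,\dot x = 0 \} = \ker (dy - p\,dx), \\
\text{lift of the curve } y = g(x) :&\quad
  x \mapsto (x,\, g(x),\, g'(x)), \qquad
  (dy - p\,dx)(1,\, g'(x),\, g''(x)) = g'(x) - g'(x) = 0 .
\end{align*}
```

The component of the velocity along p is unconstrained, which is the
coordinate expression of "no matter how the element turns". The lifted curve
is tangent to the contact hyperplanes at each of its points.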


EXAMPLE. We take some submanifold of our n-dimensional manifold and 
consider all (n - 1)-dimensional planes tangent to it (i.e., contact elements).
The set of all such contact elements forms a smooth submanifold of the 
(2n — 1)-dimensional manifold of all contact elements. The dimension of 
this submanifold is equal to n — 1, no matter what the dimension of the 
original submanifold (which could be (n — 1)-dimensional, or have smaller 
dimension, down to a curve or even a point). 

This (n — 1)-dimensional submanifold of the (2n — 1)-dimensional 
manifold of all contact elements is tangent at each of its points to the field of 
contact hyperplanes (by the definition of contact hyperplane). Thus the 
field of (2n — 2)-dimensional contact hyperplanes has an (n — 1)-dimensional 
integral manifold. 


PROBLEM. Does this field of planes have integral manifolds of higher dimensions? 
ANSWER. No. 


PROBLEM. Is it possible to give the field of contact hyperplanes by a differential 1-form on the 
manifold of all contact elements? 


ANSWER. No, even if the underlying n-dimensional manifold is a euclidean space (for example,
the ordinary two-plane). 


We will show below that the field of contact hyperplanes on the (2n — 1)- 
dimensional manifold of all contact elements of an n-dimensional manifold is 
nondegenerate. The proof uses the symplectic structure of the cotangent 
bundle. The manifold of contact elements is related by a simple construction 
to the space of the cotangent bundle (the projectivization of which is the 
manifold of contact elements). Moreover, the nondegeneracy of the field of 
contact planes of the projectivized bundle is closely related to the non- 
degeneracy of the 2-form giving the symplectic structure of the cotangent 
bundle. 

The construction we are concerned with will be carried out below in a 
somewhat more general situation. Namely, for any odd-dimensional mani- 
fold with a contact structure we can construct its “symplectification”—a 
symplectic manifold whose dimension is one larger. The inter-relation be- 
tween these two manifolds—the odd-dimensional contact manifold and the 
even-dimensional symplectic manifold—is the same as between the manifold 
of contact elements with its contact structure and the cotangent bundle with 
its symplectic structure. 




Appendix 4: Contact structures 


E Symplectification of a contact manifold 


Consider an arbitrary contact manifold, i.e., a manifold of odd dimension N 
with a nondegenerate field of tangent hyperplanes (of even dimension N — 1). 
We will call these planes contact planes. Every contact plane is tangent to 
the contact manifold at one point. We will call this point the point of contact. 


Definition. A contact form is a linear form on the tangent space at the point of 
contact of the manifold such that its zero set is the contact plane. 


It should be emphasized that the contact form is not a differential form 
but an algebraic linear form on one tangent space. 


Definition. The symplectification of a contact manifold is the set of all contact 
forms on the contact manifold, provided with the structure of a sym- 
plectic manifold as defined below. 


We note first of all that the set of all contact forms on a contact manifold 
has a natural structure of a smooth manifold of even dimension N + 1. 
Namely, we can consider the set of all contact forms as the space of a bundle 
over the original contact manifold. Projection onto the base is the mapping 
associating the contact form to the point of contact. 

The fiber of this bundle is the set of contact forms with a common point of 
contact. All such forms are obtained from one another by multiplication by a 
nonzero number (so that they determine the same contact plane). Thus the 
fiber of our bundle is one-dimensional: it is the line minus a point. 

We also note that the group of nonzero real numbers acts on the manifold 
of all contact forms by the operation of multiplication, i.e., the product of a 
contact form and a nonzero number is again a contact form. In this way the 
group acts on our bundle, leaving every fiber fixed (upon multiplication of a 
form by a number the point of contact is not changed). 

Remark. So far we have not used the nondegeneracy of the field of planes. 
Nondegeneracy is needed only to insure that the manifold obtained by 
symplectification is symplectic. 


EXAMPLE. Consider the manifold (of dimension 2n - 1) of all contact elements
of an n-dimensional smooth manifold. On the manifold of elements there is a 
field of hyperplanes (which we defined above and called the contact hyper- 
planes). Therefore, we can symplectify the manifold of contact elements. 

As a result of symplectification we obtain a 2n-dimensional manifold.
This manifold is the space of the cotangent bundle of the original
n-dimensional manifold without zero vectors. The action by the multiplicative group
of real numbers on the fiber reduces to multiplication of vectors of the
cotangent space by a number.

On the cotangent bundle there is a distinguished 1-form “p dq.” There is
an analogous 1-form on any manifold obtained by symplectification from a 
contact manifold. 


The canonical 1-form on the symplectified space 


Definition. The canonical 1-form in the symplectified space of a contact
manifold is the differential 1-form $\alpha$ whose value on any vector $\xi$ tangent
to the symplectified space at some point $p$ (Figure 237) is equal to the value
on the projection of the vector $\xi$ onto the tangent plane to the contact
manifold of the 1-form on this tangent plane which is the point $p$:


$$\alpha(\xi) = p(\pi_*\xi),$$


where $\pi$ is the projection of the symplectified space onto the contact
manifold.


Figure 237 Symplectification of a contact manifold 


Theorem. The exterior derivative of the canonical 1-form on the symplectified 
space of a contact manifold is a nondegenerate 2-form. 


Corollary. The symplectified space of a contact manifold has a symplectic 
structure which is canonically (i.e., uniquely, without arbitrariness) deter- 
mined by the contact structure of the underlying odd-dimensional manifold. 


PROOF OF THEOREM. Since the assertions of the theorem are local, it is sufficient to prove it in
a small neighborhood of a point of the manifold. In a small neighborhood of a point on a contact
manifold, a field of contact planes can be given by a differential form $\omega$ on the contact manifold.
We fix such a 1-form $\omega$.

By the same token we can represent the symplectified space of the contact manifold over
our neighborhood as the direct product of the neighborhood and the line minus a point. Namely,
we associate to the pair $(x, \lambda)$, where $x$ is a point of the contact manifold and $\lambda$ is a nonzero
number, the contact form given by the differential 1-form $\lambda\omega$ on the tangent space at the point $x$.
Thus in the part of the symplectified space we are considering, we have defined a function $\lambda$
whose values are nonzero numbers. It should be emphasized that $\lambda$ is only a local coordinate on
the symplectified manifold and that this coordinate is not defined canonically; it depends on
the choice of differential 1-form $\omega$. The canonical 1-form $\alpha$ can be written in our notation as


$$\alpha = \lambda\,\pi^*\omega$$

and does not depend on the choice of $\omega$. The exterior derivative of the 1-form $\alpha$ thus has the form

$$d\alpha = d\lambda \wedge \pi^*\omega + \lambda\,\pi^* d\omega.$$


We will show that the 2-form $d\alpha$ is nondegenerate, i.e., that for any nonzero vector $\xi$ tangent to
the symplectification, we can find a vector $\eta$ such that $d\alpha(\xi, \eta) \neq 0$. We select from vectors
tangent to the symplectification, those of the following type. We call a vector $\xi$ vertical if it
is tangent to the fiber, i.e., if $\pi_*\xi = 0$. We call the vector $\xi$ horizontal if it is tangent to a level
surface of the function $\lambda$, i.e., if $d\lambda(\xi) = 0$. We call the vector $\xi$ a contact vector if its projection
onto the contact manifold lies in the contact plane, i.e., if $\omega(\pi_*\xi) = 0$ (in other words, if $\alpha(\xi) = 0$).

We calculate the value of the form $d\alpha$ on a pair of vectors $(\xi, \eta)$:


$$d\alpha(\xi, \eta) = (d\lambda \wedge \pi^*\omega)(\xi, \eta) + (\lambda\,\pi^* d\omega)(\xi, \eta).$$


Assume that $\xi$ is not a contact vector. For $\eta$, take a nonzero vertical vector, so that $\pi_*\eta = 0$.
Then the second term is equal to zero, and the first term is equal to


$$-d\lambda(\eta)\,\omega(\pi_*\xi),$$


which is not zero since $\eta$ is a nonzero vertical vector and $\xi$ is not a contact vector. Thus if $\xi$
is not a contact vector, we have found an $\eta$ for which $d\alpha(\xi, \eta) \neq 0$.

Now assume that $\xi$ is a contact vector and not vertical. Then for $\eta$ we take any contact
vector. Now the first term is entirely zero, and the second (and therefore the sum) is reduced
to $\lambda\,d\omega(\pi_*\xi, \pi_*\eta)$. Since $\xi$ is not vertical, the vector $\pi_*\xi$ lying in the contact plane is not zero.
But the 2-form $d\omega$ is nondegenerate on the contact plane (by the definition of contact structure).
Thus there is a contact vector $\eta$ such that $d\omega(\pi_*\xi, \pi_*\eta) \neq 0$. Since $\lambda \neq 0$, we have found a
vector $\eta$ for which $d\alpha(\xi, \eta) \neq 0$.

Finally, if the vector $\xi$ is nonzero and vertical, then for $\eta$ we can take any vector which is
not a contact vector. □
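
As a concrete check (a sketch added for illustration), take the contact form
x dy + dz on three-dimensional space from the earlier example. Its
symplectification has the four coordinates x, y, z and the nonzero fiber
coordinate λ. The names p_y and p_z below are introduced only for this sketch:

```latex
\begin{align*}
\alpha &= \lambda\,(x\,dy + dz), \qquad
d\alpha = d\lambda \wedge (x\,dy + dz) + \lambda\, dx \wedge dy, \\
d\alpha \wedge d\alpha &= 2\lambda\; d\lambda \wedge dz \wedge dx \wedge dy \neq 0
  \quad \text{for } \lambda \neq 0, \\
p_y &= \lambda x, \quad p_z = \lambda
  \;\Longrightarrow\;
  \alpha = p_y\,dy + p_z\,dz, \qquad
  d\alpha = dp_y \wedge dy + dp_z \wedge dz .
\end{align*}
```

On a four-dimensional space a 2-form is nondegenerate exactly when its
exterior square is nonzero, so the second line confirms the theorem in this
case. The third line exhibits the canonical 1-form in the familiar "p dq"
shape, valid where λ ≠ 0.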


Remark. The constructions of the 1-form $\alpha$ and the 2-form $d\alpha$ are valid
for an arbitrary manifold with a field of hyperplanes, and do not depend on
the condition of nondegeneracy. However, the 2-form $d\alpha$ will define a
symplectic structure only in the case when the field of planes is nondegenerate.


PROOF. Assume that the field is degenerate, i.e., that there exists a nonzero vector $\xi'$ in a plane
of the field such that $d\omega(\xi', \eta') = 0$ for all vectors $\eta'$ in this plane. For such a $\xi'$, the quantity
$d\omega(\xi', \eta')$ as a function of $\eta'$ is a linear form, identically equal to zero on the plane of the field.
Therefore there is a number $\mu$ not dependent on $\eta'$ such that


$$d\omega(\xi', \eta') = \mu\,\omega(\eta')$$


for all vectors $\eta'$ of the tangent space.

We now take for $\xi$ a tangent vector to the symplectified manifold for which $\pi_*\xi = \xi'$. Such
a vector $\xi$ is determined up to addition of a vertical summand, and we will show that for a suitable
choice of this summand we will have


$$d\alpha(\xi, \eta) = 0 \quad \text{for all } \eta.$$

The first term of the formula for $d\alpha$ is equal to $d\lambda(\xi)\,\omega(\pi_*\eta)$ (since $\omega(\pi_*\xi) = 0$). The second
term is equal to $\lambda\,d\omega(\pi_*\xi, \pi_*\eta) = \lambda\mu\,\omega(\pi_*\eta)$. We choose the vertical component of the vector $\xi$
so that $d\lambda(\xi) = -\lambda\mu$. Then $\xi$ will be skew-orthogonal to all vectors $\eta$.

Thus if $d\alpha$ is a symplectic structure, then the underlying field of hyperplanes is a contact
structure.
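
The simplest degenerate case illustrates the remark. The following sketch
uses the integrable field of horizontal planes in three-dimensional space,
given by the form dz; this example is chosen here and is not taken from the
text:

```latex
\begin{align*}
\omega = dz, \quad d\omega = 0
\;\Longrightarrow\;
\alpha = \lambda\,dz, \qquad
d\alpha = d\lambda \wedge dz, \qquad
d\alpha\!\left(\frac{\partial}{\partial x},\, \cdot\,\right)
  = d\alpha\!\left(\frac{\partial}{\partial y},\, \cdot\,\right) = 0 .
\end{align*}
```

Here the number μ of the proof is zero, no vertical correction is needed, and
the 2-form has rank 2 on a four-dimensional space, so it is not symplectic.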


Corollary. The field of contact hyperplanes defines a contact structure on the 
manifold of all contact elements of any smooth manifold. 


PROOF. The symplectification of the (2n - 1)-dimensional manifold of all
contact elements on an n-dimensional smooth manifold, constructed with the
help of the field of (2n - 2)-dimensional contact planes, is by construction
the space of the cotangent bundle of the underlying n-dimensional manifold
without the zero cotangent vectors. The canonical 1-form $\alpha$ on the
symplectification is, by its definition, the same 1-form on the cotangent bundle
that we called “p dq” and which is fundamental in hamiltonian mechanics (cf.
Section 37). Its derivative $d\alpha$ is therefore the form “$dp \wedge dq$” defining the
usual symplectic structure of a phase space. Therefore the form $d\alpha$ is
nondegenerate, and, by the preceding remark, the field of contact hyperplanes is
nondegenerate.


F Contact diffeomorphisms and vector fields 


Definition. A diffeomorphism of a contact manifold to itself is called a 
contact diffeomorphism if it preserves the contact structure, i.e., carries 
every plane of a given structure of a field of hyperplanes to a plane of the 
same field. 


EXAMPLE. Consider the (2n — 1)-dimensional manifold of contact elements 
of an n-dimensional smooth manifold with its usual contact structure. To 
each contact element we can ascribe a “positive side” by choosing one of the 
halves into which this element divides the tangent space to the n-dimensional 
manifold. 

We will call a contact element with a chosen side a (transversally) oriented 
contact element. 

The oriented contact elements on our n-dimensional manifold form a 
(2n - 1)-dimensional smooth manifold with a natural contact structure (it
is a double covering of the manifold of ordinary nonoriented contact 
elements). 

Now assume that we are given a riemannian metric on the underlying
n-dimensional manifold. Then there is a “geodesic flow”¹⁰⁰ on the manifold
of oriented contact elements. The transformation after time t by this flow
is defined as follows. We go out from the point of contact of a contact element
along the geodesic orthogonal to it and directed to the side orienting the
element. In the course of time t we will move the point of contact along the
geodesic, keeping the element orthogonal to the geodesic. After time t we
obtain a new oriented element. We have defined the geodesic flow of oriented
contact elements.


¹⁰⁰ Strictly speaking, we need to require that the riemannian manifold be complete, i.e., geodesics
can be continued without limit.


Theorem. The geodesic flow of oriented contact elements consists of contact 
diffeomorphisms. 


The proof of this theorem will not be presented since it is just a reformula- 
tion in new terms of Huygens’ principle (cf. Section 46). 


Definition. A vector field on a contact manifold is called a contact vector field 
if it is the velocity field of a one-parameter (local) group of contact 
diffeomorphisms. 


Theorem. The Poisson bracket of contact vector fields is a contact vector field. 
The contact vector fields form a subalgebra in the Lie algebra of all smooth 
vector fields on a contact manifold. 


The proof follows directly from the definitions. 
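
A small example may make the statement concrete. The sketch below works with
the form x dy + dz on three-dimensional space and relies on two assumptions
not stated in the text: the standard criterion that a field V is a contact
field when the Lie derivative of the form along V is a function multiple of
the form, and the bracket convention [X, Y]f = X(Yf) - Y(Xf). The field names
V_a, V_b, V_c are introduced here.

```latex
\begin{align*}
V_a &= \frac{\partial}{\partial x} - y \frac{\partial}{\partial z}, &
L_{V_a}\omega &= d(-y) + dy = 0, \\
V_b &= \frac{\partial}{\partial y}, &
L_{V_b}\omega &= 0, \\
V_c &= \frac{\partial}{\partial z}, &
L_{V_c}\omega &= 0, \\
[V_a, V_b] &= \frac{\partial}{\partial z} = V_c, &
[V_a, V_c] &= [V_b, V_c] = 0 .
\end{align*}
```

The bracket of the two contact fields V_a and V_b is again a contact field,
as the theorem asserts. With the opposite sign convention for the bracket the
result changes sign only.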


G Symplectification of contact diffeomorphisms 
and fields 


For every contact diffeomorphism of a contact manifold there is a canonically 
constructed symplectic diffeomorphism of its symplectification. This sym- 
plectic diffeomorphism commutes with the action of the multiplicative group 
of real numbers on the symplectified manifold and is defined by the following 
construction. 

Recall that a point of the symplectified manifold is a contact form on the 
underlying contact manifold. 


Definition. The image of a contact form $p$ with point of contact $x$ under the
action of a contact diffeomorphism $f$ of the contact manifold to itself is
the form


$$f_! p = (f^*|_{f(x)})^{-1} p,$$


where $f^*|_{f(x)}$ takes linear forms on the tangent space at $f(x)$ to linear
forms on the tangent space at $x$.

In simple terms, we carry the form $p$ from the tangent space at the point $x$
to the tangent space at $f(x)$ using the diffeomorphism $f$ (whose derivative at
$x$ determines an isomorphism between these two tangent spaces). The form
$f_! p$ is a contact form since the diffeomorphism $f$ is a contact diffeomorphism.


Theorem. The mapping $f_!$ defined above of the symplectification of a contact
manifold to itself is a symplectic diffeomorphism which commutes with the
action of the multiplicative group of real numbers and preserves the canonical
1-form on the symplectification.

PROOF. The assertion of the theorem follows from the fact that the canonical 1-form, the
symplectic 2-form, and the action of the group of real numbers are all determined by the contact
structure itself (for their construction we did not use coordinates or any other noninvariant
tools), and the diffeomorphism $f$ preserves the contact structure. It follows from this that $f_!$
preserves all that which was invariantly constructed using the contact structure, in particular
the 1-form $\alpha$, its derivative $d\alpha$, and the action of the group.
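
The theorem can be verified by hand in a simple case. The following sketch
assumes the contact form x dy + dz on three-dimensional space, the local
coordinates (x, y, z, λ) on its symplectification, and a scaling map f with a
nonzero constant s. The map, the constant, and the notation f_! for the
induced map on the symplectification are choices made for this illustration:

```latex
\begin{align*}
f(x, y, z) &= (s x,\; y,\; s z), &
f^*\omega &= s x\,dy + s\,dz = s\,\omega, \\
f_!(x, y, z, \lambda) &= (s x,\; y,\; s z,\; \lambda / s), &
f_!^{\,*}\alpha &= \frac{\lambda}{s}\,\bigl(s x\,dy + s\,dz\bigr)
                = \lambda\,(x\,dy + dz) = \alpha .
\end{align*}
```

The map f multiplies the form by s, so it preserves the field of planes. The
induced map divides the fiber coordinate by s, which is exactly what is needed
for the canonical 1-form to be preserved, and it visibly commutes with
multiplication of λ by a nonzero number.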


Theorem. Every symplectic diffeomorphism of the symplectification of a contact 
manifold which commutes with the action of the multiplicative group (1) 
projects onto the underlying contact manifold as a contact diffeomorphism 
and (2) preserves the canonical 1-form $\alpha$.


PROOF. Every diffeomorphism which commutes with the action of the multiplicative group
projects onto some diffeomorphism of the contact manifold. To show that this is a contact
diffeomorphism it is sufficient to prove the second assertion of the theorem (since only those
vectors $\xi$ for which $\alpha(\xi) = 0$ project onto the contact plane).

To prove the second assertion we express the integral of the form $\alpha$ along any path $\gamma$ in terms
of the symplectic structure $d\alpha$:


$$\int_\gamma \alpha = \lim_{\varepsilon \to 0} \iint_{\sigma(\varepsilon)} d\alpha,$$


where the 2-chain $\sigma(\varepsilon)$ is obtained from $\gamma$ by multiplication by all numbers in the interval $[\varepsilon, 1]$.
The boundary of $\sigma$ contains, besides $\gamma$, two vertical intervals and the path $\varepsilon\gamma$. The integrals of $\alpha$
over the vertical intervals are equal to zero, and the integral over $\varepsilon\gamma$ approaches 0 as $\varepsilon$ does.

Now from the invariance of the 2-form $d\alpha$ and the commutativity of our diffeomorphism $F$
with multiplication by numbers it follows that for any path $\gamma$


$$\int_{F\gamma} \alpha = \int_\gamma \alpha,$$


and thus the diffeomorphism $F$ preserves the 1-form $\alpha$. □


Definition. The symplectification of a contact vector field is defined by the 
following construction. Consider the field as a velocity field of a one- 
parameter group of contact diffeomorphisms. Symplectify the diffeomor- 
phisms. Consider the velocity field of this group. It is called the sym- 
plectification of the original field. 


Theorem. The symplectification of a contact vector field is a hamiltonian vector 
field. The hamiltonian can be chosen to be homogeneous of first order with 
respect to the action of multiplication by the group of real numbers: 


$$H(\lambda x) = \lambda H(x).$$


Conversely, every hamiltonian field on a symplectified contact manifold,
having a hamiltonian which is homogeneous of degree 1, projects onto the
underlying contact manifold as a contact vector field.


PROOF. The fact that symplectifications of contact diffeomorphisms are
symplectic implies that the symplectification of a contact field is hamiltonian.
The homogeneity of the hamiltonian follows from the homogeneity of
symplectic diffeomorphisms (from commutativity with multiplication by $\lambda$).
Thus the first assertion of the theorem follows from the theorem on
symplectifications of contact diffeomorphisms. The second part follows in the
same way from the theorem on homogeneous symplectic diffeomorphisms. □
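
Two hamiltonians can be computed explicitly as a check. The sketch below
assumes the contact form x dy + dz, the local coordinates (x, y, z, λ) on the
symplectification with canonical 1-form α = λ(x dy + dz), and the sign
convention that a field X with hamiltonian H satisfies i_X dα = -dH. The
fields X_1, X_2 and the formula H = α(X) are notation and a standard fact
brought in for this illustration:

```latex
\begin{align*}
X_1 &= \frac{\partial}{\partial z}, &
H_1 &= \alpha(X_1) = \lambda, &
i_{X_1} d\alpha &= -d\lambda = -dH_1, \\
X_2 &= x \frac{\partial}{\partial x} + z \frac{\partial}{\partial z}
       - \lambda \frac{\partial}{\partial \lambda}, &
H_2 &= \alpha(X_2) = \lambda z, &
i_{X_2} d\alpha &= -\lambda\,dz - z\,d\lambda = -dH_2 .
\end{align*}
```

The field X_1 symplectifies the translations along z, and X_2 symplectifies
the scalings (x, y, z) to (e^t x, y, e^t z), under which λ is multiplied by
e^{-t}. Both hamiltonians are linear in λ, so they are homogeneous of degree 1
with respect to multiplication of λ by a number.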


Corollary. Symplectification of vector fields is an isomorphic map of the Lie 
algebra of contact vector fields onto the Lie algebra of all locally hamiltonian 
vector fields with hamiltonians which are homogeneous of degree 1. 


The proof is clear. 


H Darboux’s theorem for contact structures 


Darboux’s theorem is a theorem on the local uniqueness of a contact struc- 
ture. It can be formulated in any of the following three ways. 


Theorem. All contact manifolds of the same dimension are locally contact 
diffeomorphic (i.e., there is a diffeomorphism of a sufficiently small neighbor- 
hood of any point of one contact manifold onto a neighborhood of any point 
of the other which carries the noted point of the first neighborhood to the 
noted point of the second and the field of planes in the first neighborhood to 
the field of planes in the second). 


Theorem. Every contact manifold of dimension 2m — 1 is locally contact 
diffeomorphic to the manifold of contact elements of m-dimensional space. 


Theorem. Every differential 1-form defining a nondegenerate field of hyperplanes 
on a manifold of dimension 2n + 1 can be written in some local coordinate 
system in the “normal form” 


ω = x dy + dz, 


where x = (x₁, ..., xₙ), y = (y₁, ..., yₙ), and z are the local coordinates. 
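
As a quick sanity check (a sketch for n = 1 only, not part of the original argument), one can confirm that the normal form really does define a nondegenerate field of planes. The criterion used here, that ω ∧ dω vanishes nowhere, is the standard equivalent of nondegeneracy and is assumed rather than quoted from this section:

```latex
\omega = x\,dy + dz, \qquad d\omega = dx \wedge dy, \qquad
\omega \wedge d\omega = (x\,dy + dz) \wedge dx \wedge dy
                      = dz \wedge dx \wedge dy
                      = dx \wedge dy \wedge dz \neq 0 .
```

The term x dy ∧ dx ∧ dy drops out because dy appears twice, so only the dz term survives.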


It is clear that the first two theorems follow from the third. We will deduce 
the third one from an analogous theorem of Darboux on the normal form of 
the 2-form giving a symplectic structure (cf. Section 43). 


PROOF OF DARBOUX’S THEOREM. We symplectify our manifold. On this new (2n + 2)-dimensional 
symplectic manifold there are a canonical 1-form α, a nondegenerate 2-form dα, a projection π 
onto the underlying contact manifold and a vertical direction at every point. 

The given differential 1-form ω on the contact manifold defines a contact form at every 
point. These contact forms form a (2n + 1)-dimensional submanifold of the symplectic 
manifold. The projection π maps this submanifold diffeomorphically onto the underlying contact 
manifold, and the verticals intersect this submanifold at a nonzero angle. 

Consider a point in the surface just constructed (in the symplectic manifold) lying over the 
point of the contact manifold we are interested in. In the symplectic manifold we can choose a 
local system of coordinates near this point such that 


dα = dp₀ ∧ dq₀ + ⋯ + dpₙ ∧ dqₙ 


and such that the coordinate surface p₀ = 0 coincides with our (2n + 1)-dimensional manifold 
(cf. Section 43, where in the proof of the symplectic Darboux’s theorem the first coordinate may 
be chosen arbitrarily). 

We note now that the 1-form p₀ dq₀ + ⋯ + pₙ dqₙ has derivative dα. Thus, locally, 


α = p₀ dq₀ + ⋯ + pₙ dqₙ + dw, 


where w is a function which can be taken to be zero at the origin. In particular, on the surface 
p₀ = 0 the form α takes the form 


α|_{p₀=0} = p₁ dq₁ + ⋯ + pₙ dqₙ + dw. 


The projection π allows us to carry the coordinates p₁, ..., pₙ; q₁, ..., qₙ and the function 
w onto the contact manifold. More precisely, we define functions x, y, and z by the formulas 


x(πA) = p(A),  y(πA) = q(A),  z(πA) = w(A), 


where A is a point on the surface p₀ = 0. 
Then we obtain 


ω = x dy + dz 


and it remains only to verify that the functions (x₁, ..., xₙ; y₁, ..., yₙ; z) form a coordinate 
system. For this it is sufficient to verify that the partial derivative of w with respect to q₀ is not 
zero, or in other words that the 1-form α is not zero on a vector of the coordinate direction q₀. 
The latter is equivalent to the 2-form dα being nonzero on the pair of vectors: the basic vector 
in the direction of q₀ and the vertical vector. 

But a vector in the coordinate direction q₀ is skew-orthogonal to all vectors of the coordinate 
plane p₀ = 0. If it were also skew-orthogonal to the vertical vector, then it would be 
skew-orthogonal to all vectors, which contradicts the nondegeneracy of dα. Thus ∂w/∂q₀ ≠ 0, and the 
theorem is proved. □ 


I Contact hamiltonians 


Suppose that the contact structure of a contact manifold is given by a 
differential 1-form ω, and that this form is fixed. 


Definition. The ω-embedding of the contact manifold into its symplectification 
is the map associating to a point of the contact manifold the restriction of 
the form ω to the tangent plane at this point. 


Definition. The contact hamiltonian function of a contact vector field on a 
contact manifold with fixed 1-form ω is the function K on the contact 
manifold whose value at each point is the value of the homogeneous 
hamiltonian H of the symplectification of the field on the image of the 
given point under the ω-embedding: 


K(x) = H(ω|ₓ). 


Theorem. The contact hamiltonian function K of a contact vector field X on a 
contact manifold with a given 1-form ω is equal to the value of the form ω 
on this contact field: 


K = ω(X). 


PROOF. We use the expression for the increment of the ordinary hamiltonian function over a 
path in terms of the vector field and the symplectic structure (Section 48C). For this we draw a 
vertical interval {λB}, 0 < λ ≤ 1, through the point B of the symplectification at which we 
want to calculate the hamiltonian function. The translations of this interval over small time t 
under the action of the symplectified flow defined by our field X fill out a two-dimensional 
region σ(t). The value of the hamiltonian at the point B is equal to the limit 


H(B) = lim_{t→0} t⁻¹ ∬_{σ(t)} dα, 


since H(λB) → 0 as λ → 0. But the integral of the form dα over the region σ(t) is the integral of 
the 1-form α along the edge formed by the trajectory of the point B (the other parts of the boundary 
give zero integrals). Therefore, the double integral is simply the integral of the 1-form α along 
the interval of trajectories, and the limit is the value of α on the velocity vector Y of the 
symplectified field. Thus K(πB) = H(B) = α(Y) = ω(X), as was to be shown. 

J Computational formulas 


Suppose now that we make use of the coordinates in Darboux’s theorem in 
which the form ω has the normal form 


ω = x dy + dz,  x = (x₁, ..., xₙ),  y = (y₁, ..., yₙ). 


PROBLEM. Find the components of the contact field with a given contact 
hamiltonian function K = K(x, y, z). 
ANSWER. The equations of the contact flow have the form 

ẋ = −K_y + xK_z, 
ẏ = K_x, 
ż = K − xK_x. 


Solution. A point of the symplectification can be given by the 2n + 2 numbers xᵢ, yᵢ, z, 
and λ, where (x, y, z) are the coordinates of a point of the contact manifold and λ is the number 
by which we must multiply ω to obtain the given point of the symplectified space. 

In these coordinates α = λx dy + λ dz. Therefore, in the coordinate system 𝐩, 𝐪, where 


𝐩 = (p, p₀):  p = λx,  p₀ = λ; 
𝐪 = (q, q₀):  q = y,  q₀ = z, 

the form α takes the standard form: 

α = 𝐩 d𝐪,  dα = d𝐩 ∧ d𝐪. 
The action T_μ of the multiplicative group is now reduced to multiplication of 𝐩 by a number: 


T_μ(𝐩, 𝐪) = (μ𝐩, 𝐪). 

The contact hamiltonian K can be expressed in terms of the ordinary hamiltonian 
H = H(p, q, p₀, q₀) by the formula 


K(x, y, z) = H(x, y, 1, z). 


The function H is homogeneous of degree 1 in 𝐩. Therefore, the partial derivatives of K at the 
point (x, y, z) are related to the derivatives of H at the point (p = x, p₀ = 1, q = y, q₀ = z) by 
the relations 


H_q = K_y,  H_{q₀} = K_z, 


H_p = K_x,  H_{p₀} = K − xK_x 


(the last one by Euler’s relation H = pH_p + p₀H_{p₀} for a function homogeneous of degree 1). 


Hamilton’s equations with hamiltonian function H therefore have the following form at the 
point under consideration: 

ẋ + xλ̇ = −K_y,  λ̇ = −K_z, 


ẏ = K_x,  ż = K − xK_x, 


from which we obtain the answer above. 
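
The answer can be spot-checked numerically. The following is a minimal sketch for n = 1, assuming central finite differences for the partial derivatives; the names `partial`, `contact_field`, `omega`, and the test hamiltonian `K_example` are hypothetical and chosen only for illustration. It builds the contact field from K and confirms that the form ω = x dy + dz takes the value K on it, in agreement with the theorem K = ω(X).

```python
from typing import Callable, Tuple

Point = Tuple[float, float, float]


def partial(f: Callable[[float, float, float], float], pt: Point, i: int,
            h: float = 1e-5) -> float:
    """Central-difference estimate of the i-th partial derivative of f at pt."""
    lo = list(pt)
    hi = list(pt)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2.0 * h)


def contact_field(K: Callable[[float, float, float], float], pt: Point) -> Point:
    """Velocity (x', y', z') of the contact flow with contact hamiltonian K (n = 1)."""
    x = pt[0]
    Kx, Ky, Kz = (partial(K, pt, i) for i in range(3))
    return (-Ky + x * Kz, Kx, K(*pt) - x * Kx)


def omega(pt: Point, v: Point) -> float:
    """Value of the 1-form omega = x dy + dz on the vector v attached at pt."""
    return pt[0] * v[1] + v[2]


def K_example(x: float, y: float, z: float) -> float:
    # Hypothetical contact hamiltonian, used only as a test case.
    return x * y + z * z + 0.5 * x * x


pt = (0.3, -1.2, 0.7)
X = contact_field(K_example, pt)
print(abs(omega(pt, X) - K_example(*pt)) < 1e-8)
```

The check ω(X) = x K_x + (K − x K_x) = K is algebraic, so it holds for any K; the numerical version is useful mainly as a guard against sign slips when transcribing the three equations.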


PROBLEM. Find the contact hamiltonian of the Poisson bracket of two contact 
fields with contact hamiltonians K and K′. 


ANSWER. (K, K′) + K_z EK′ − K′_z EK, where the brackets denote Poisson 
bracket in the variables x and y and E is the Euler operator EF = F − xF_x. 


Solution. In the notation of the solution of the preceding problem we must express the 
ordinary Poisson bracket of the homogeneous hamiltonians H and H′ at the point 
(p = x, p₀ = 1, q = y, q₀ = z) in terms of the contact hamiltonians K and K′. We have 

(H, H′) = H_𝐪 H′_𝐩 − H_𝐩 H′_𝐪 = H_q H′_p − H_p H′_q + H_{q₀} H′_{p₀} − H_{p₀} H′_{q₀}. 

Substituting the values of the derivatives from the preceding problem, we find at the point under 
consideration 

(H, H′) = K_y K′_x − K_x K′_y + K_z(K′ − xK′_x) − K′_z(K − xK_x). 
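
The substitution can be checked numerically. The sketch below (n = 1) assumes the homogeneous extension H(p, q, p₀, q₀) = p₀ K(p/p₀, q, q₀), which agrees with K(x, y, z) = H(x, y, 1, z) and is homogeneous of degree 1 in (p, p₀), and it uses the bracket convention (H, H′) = H_q H′_p − H_p H′_q from the solution above. The helper names and the sample hamiltonians `K1`, `K2` are hypothetical.

```python
def grad(f, pt, h=1e-4):
    """Central-difference gradient of f at pt (a tuple of floats)."""
    out = []
    for i in range(len(pt)):
        lo, hi = list(pt), list(pt)
        lo[i] -= h
        hi[i] += h
        out.append((f(*hi) - f(*lo)) / (2.0 * h))
    return out


def homogenize(K):
    """H(p, q, p0, q0) = p0 * K(p / p0, q, q0), homogeneous of degree 1 in (p, p0)."""
    return lambda p, q, p0, q0: p0 * K(p / p0, q, q0)


def poisson(H1, H2, pt):
    """(H, H') = H_q H'_p - H_p H'_q, summed over the pairs (p, q) and (p0, q0)."""
    a = grad(H1, pt)  # component order: p, q, p0, q0
    b = grad(H2, pt)
    return a[1] * b[0] - a[0] * b[1] + a[3] * b[2] - a[2] * b[3]


def contact_bracket(K1, K2, pt):
    """(K, K') + K_z E K' - K'_z E K with E F = F - x F_x, at pt = (x, y, z)."""
    x = pt[0]
    k1x, k1y, k1z = grad(K1, pt)
    k2x, k2y, k2z = grad(K2, pt)
    e1 = K1(*pt) - x * k1x
    e2 = K2(*pt) - x * k2x
    return k1y * k2x - k1x * k2y + k1z * e2 - k2z * e1


# Hypothetical sample hamiltonians.
K1 = lambda x, y, z: x * y + z
K2 = lambda x, y, z: y * z + x

x, y, z = 0.5, 0.2, 0.4
lhs = poisson(homogenize(K1), homogenize(K2), (x, y, 1.0, z))
rhs = contact_bracket(K1, K2, (x, y, z))
print(abs(lhs - rhs) < 1e-6)
```

For this particular pair the bracket works out by hand to x − yz, which gives an independent value to compare against.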


K Legendre manifolds 


The lagrangian submanifolds of a symplectic phase space correspond in the 
contact case to an interesting class of manifolds which may be called Legendre 
manifolds since they are closely related to Legendre transformations. 


Definition. A Legendre submanifold of a (2n + 1)-dimensional contact mani- 
fold is an n-dimensional integral manifold of the field of contact planes. 


In other words, it is an integral manifold of the highest possible dimension 
for a nondegenerate field of planes. 


EXAMPLE 1. The set of all contact elements tangent to a submanifold of any 
dimension in an m-dimensional manifold is an (m − 1)-dimensional Legendre 
submanifold of the (2m − 1)-dimensional contact manifold of all contact 
elements. 


EXAMPLE 2. The set of all planes tangent to the graph of a function f = φ(x) 
in an (n + 1)-dimensional euclidean space with coordinates (x₁, ..., xₙ; f) 
is a Legendre submanifold of the (2n + 1)-dimensional space of all 
nonvertical hyperplane elements in the space of the graph (the contact structure 
is given by the 1-form 


ω = p₁ dx₁ + ⋯ + pₙ dxₙ − df; 


the element with coordinates (p, x, f) passes through the point with 
coordinates (x, f) parallel to the plane f = p₁x₁ + ⋯ + pₙxₙ). 


The Legendre transformation can be described in these terms in the 
following way. 

Consider a second (2n + 1)-dimensional contact space with coordinates 
(P, X, F) and contact structure given by the form 


Ω = P dX − dF. 


The Legendre involution is the map taking a point of the first space with 
coordinates (p, x, f) to the point of the second space with coordinates 


P = x,  X = p,  F = px − f. 


The Legendre involution, as can be easily calculated, carries the first 
contact structure to the second. Clearly, we have 


Theorem. A diffeomorphism of one contact manifold onto another which carries 
contact planes to contact planes, carries every Legendre manifold to a 
Legendre manifold. 


In particular, under the action of the Legendre involution the Legendre 
manifold of plane elements tangent to the graph of a function is carried into a 
new Legendre manifold. This new manifold is called the Legendre transform 
of the original manifold. 

The projection of the new manifold onto the space with coordinates (X, F) 
(parallel to the P-direction) is in general not a smooth manifold, but has 
singularities. This projection is called the Legendre transform of the graph of 
the function φ. 

If the function φ is convex, then the projection is itself the graph of a 
function F = Φ(X). In this case Φ is called the Legendre transform of the 
function φ. 
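
A small one-dimensional illustration, given as a sketch under stated assumptions: the convex function φ(x) = x² and its hand-supplied derivative are hypothetical choices, as are the helper names. Applying the Legendre involution (p, x, f) ↦ (x, p, px − f) to the elements tangent to the graph of φ should produce points with F = X²/4, the classical Legendre transform of x².

```python
def legendre_involution(p, x, f):
    """Map the point (p, x, f) to (P, X, F) = (x, p, p*x - f)."""
    return (x, p, p * x - f)


def tangent_element(phi, dphi, x):
    """Element tangent to the graph of phi at x: (p, x, f) = (phi'(x), x, phi(x))."""
    return (dphi(x), x, phi(x))


def phi(x):
    return x * x        # hypothetical convex example


def dphi(x):
    return 2.0 * x      # derivative of phi, supplied by hand


for x in (-1.0, 0.5, 2.0):
    P, X, F = legendre_involution(*tangent_element(phi, dphi, x))
    print(X, F, X * X / 4.0)   # the last two columns should agree
```

Applying `legendre_involution` twice returns the original point, which is why the map is called an involution.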

As another example we consider the motion of oriented contact elements 
under the action of the geodesic flow on a riemannian manifold. As the 
“initial wave front” we take some smooth submanifold of our riemannian 
manifold (the dimension of the submanifold is arbitrary). The oriented con- 
tact elements tangent to this submanifold form a Legendre manifold in the 
space of all contact elements. From the preceding theorem we obtain 




Corollary. The family of all elements tangent to a wave front is transformed 
under the action of the geodesic flow after time t to a Legendre manifold of 
the space of all contact elements. 


It should be noted that this new Legendre manifold may not be the family 
of all elements tangent to some smooth manifold, since a wave front may 
develop singularities. 

The Legendre singularities which arise in this way can be described in a 
manner similar to lagrangian singularities (cf. Appendix 12). A Legendre 
fibration of a (2n + 1)-dimensional contact manifold is a fibration all of 
whose fibers are n-dimensional Legendre manifolds. A Legendre singularity 
is a singularity of the projection of an n-dimensional Legendre submanifold 
of a (2n + 1)-dimensional contact manifold onto the (n + 1)-dimensional 
base of the Legendre fibration. 

Consider the space R²ⁿ⁺¹ with contact structure given by the form 
α = x dy + dz, where x = (x₁, ..., xₙ) and y = (y₁, ..., yₙ). The projection 
(x, y, z) ↦ (y, z) gives a Legendre fibration. 

An equivalence of Legendre fibrations is a diffeomorphism of the total 
spaces of the fibrations carrying the contact structure and fibers of the first 
bundle to the contact structure and fibers of the second bundle. It can be 
shown that every Legendre bundle is equivalent to the special bundle just 
described in a neighborhood of every point of the space of the bundle. 

The contact structure of the total space of the fibration gives the fibers a local 
structure of a projective space. Legendre equivalence preserves this structure, 
i.e., defines locally projective fiber transformations. 

The following theorem allows us to locally describe Legendre sub- 
manifolds and maps by using generating functions. 


Theorem. For any partition I + J of the set of indices (1, ..., n) into two 
disjoint subsets and for any function S(x_I, y_J) of n variables xᵢ, i ∈ I, yⱼ, j ∈ J, the 
formulas 


y_I = ∂S/∂x_I,  x_J = −∂S/∂y_J,  z = S − x_I ∂S/∂x_I 


define a Legendre submanifold of R²ⁿ⁺¹. Conversely, every Legendre 
submanifold of R²ⁿ⁺¹ is defined in a neighborhood of every point by these formulas 
for at least one of the 2ⁿ possible choices of the subset I. 


The proof is based on the fact that, on a Legendre manifold, dz + x dy = 0, 
so d(z + x_I y_I) = y_I dx_I − x_J dy_J. □ 


In the formulas of the preceding theorem, we replace S by a function from 
the list of the simple lagrangian singularities given in Appendix 12. We 
obtain Legendre singularities which are preserved under small deformations 
of the Legendre mapping (x, y, z) ↦ (y, z) (i.e., are carried to equivalent 
singularities for small deformations of the function S). Every Legendre 
mapping for n < 6 can be approximated by a map, all of whose singularities 
are locally equivalent to singularities from the list A_k (1 ≤ k ≤ 6), D_k 
(4 ≤ k ≤ 6), E₆. 

In particular, we obtain a list of the singularities of a wave front in general 
position in spaces of dimension less than 7. 

In ordinary three-space this list is as follows: 


A₁: S = ±x₁²,  A₂: S = ±x₁³,  A₃: S = ±x₁⁴ + x₁²y₂, 


where I = {1}, J = {2}, and n = 2. 

The projections of the Legendre manifolds indicated here onto the base 
of the Legendre bundle (i.e., onto the space with coordinates y₁, y₂, and z) 
are: a simple point in the case of A₁, a cuspidal edge in the case of A₂, and a 
swallowtail (cf. Figure 246) in the case of A₃. 
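
The A₃ case can be made concrete with a short sketch. It assumes the generating-function formulas in the form y_I = ∂S/∂x_I, x_J = −∂S/∂y_J, z = S − x_I ∂S/∂x_I with I = {1}, J = {2}, and takes the sign + in S = x₁⁴ + x₁²y₂; the function names are hypothetical. The code parametrizes the surface by (x₁, y₂) and checks by finite differences that α = x dy + dz vanishes on its tangent vectors, so the surface is Legendre. Its projection to (y₁, y₂, z) is the front.

```python
def a3_point(x1, y2, sign=1.0):
    """Point (x1, x2, y1, y2, z) of the surface generated by S = sign*x1**4 + x1**2*y2."""
    S = sign * x1 ** 4 + x1 ** 2 * y2
    S_x1 = 4.0 * sign * x1 ** 3 + 2.0 * x1 * y2   # dS/dx1
    S_y2 = x1 ** 2                                # dS/dy2
    y1 = S_x1
    x2 = -S_y2
    z = S - x1 * S_x1
    return (x1, x2, y1, y2, z)


def alpha_on_tangent(x1, y2, direction, h=1e-5):
    """alpha = x1 dy1 + x2 dy2 + dz on the tangent vector in the given parameter direction."""
    d1, d2 = direction
    a = a3_point(x1 - h * d1, y2 - h * d2)
    b = a3_point(x1 + h * d1, y2 + h * d2)
    dy1, dy2, dz = ((b[i] - a[i]) / (2.0 * h) for i in (2, 3, 4))
    px1, px2 = a3_point(x1, y2)[:2]
    return px1 * dy1 + px2 * dy2 + dz


def front_point(x1, y2):
    """Projection of the Legendre surface to the base with coordinates (y1, y2, z)."""
    _, _, y1, y2_out, z = a3_point(x1, y2)
    return (y1, y2_out, z)


print(abs(alpha_on_tangent(0.7, -0.3, (1.0, 0.0))) < 1e-6)
print(abs(alpha_on_tangent(0.7, -0.3, (0.0, 1.0))) < 1e-6)
```

Sampling `front_point` on a grid of (x₁, y₂) values and plotting the result is one way to see the swallowtail surface.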

Thus a wave front in general position in three-space has only cusps and 
“swallowtail” points as singularities. At isolated moments of time during the 
motion of the front we can observe transitions of the three types A₄, D₄⁻, and 
D₄⁺ (cf. Appendix 12, where the corresponding caustics filled out by the 
singularities of the front during its motion are drawn). 


PROBLEM 1. Lay out an interval of length t on every interior normal to an ellipse in the plane. 
Draw the curve obtained and investigate its singularities and its transitions as t changes. 


PROBLEM 2. Do the same thing for a triaxial ellipsoid in three-dimensional space. 


L Contactification 


Along with symplectification of contact manifolds, there is a contactification 
of symplectic manifolds with symplectic structure cohomologous to zero. 

The contactification E²ⁿ⁺¹ of the symplectic manifold (M²ⁿ, ω²) is 
constructed as the space of a bundle with fiber R over M²ⁿ. Let U be a sufficiently 
small neighborhood of a point x in M, so that there is a canonical coordinate 
system p, q on U with ω = dp ∧ dq. Consider the direct product U × R 
with coordinates p, q, z. Let V × R be the same kind of product constructed 
on another (or the same) neighborhood V, with coordinates P, Q, Z; dP ∧ dQ 
= ω. If the neighborhoods U and V on M intersect, then we identify the 
fibers above the points of intersection in both representations so that the 
form dz + p dq = dZ + P dQ = α is defined on the whole (this is possible 
since P dQ − p dq is a total differential on U ∩ V). 
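
The identification of fibers can be written out explicitly. In the sketch below, S is a hypothetical name for a local primitive of the total differential P dQ − p dq on the overlap; it is defined only up to an additive constant, which is where the global condition on the symplectic structure enters.

```latex
P\,dQ - p\,dq = dS \quad \text{on } U \cap V
\qquad\Longrightarrow\qquad
Z = z - S(p,q)
\quad\text{gives}\quad
dZ + P\,dQ = dz - dS + P\,dQ = dz + p\,dq .
```

With this gluing the form α = dz + p dq is well defined, and α ∧ (dα)ⁿ = dz ∧ (dp ∧ dq)ⁿ vanishes nowhere, which is the usual nondegeneracy criterion for a contact form.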

It is easy to verify that after this pasting together we have a bundle E²ⁿ⁺¹ 
on M²ⁿ and that the form α defines a contact structure on E. The manifold E 
is called the contactification of the symplectic manifold M. If the cohomology 
class of the form ω² is integral, then we can define a contactification with 
fiber S¹. 


M Integration of first-order partial differential equations 


Let M²ⁿ⁺¹ be a contact manifold, and E²ⁿ a hypersurface in M²ⁿ⁺¹. The 
contact structure on M defines some geometric structure on E, in particular 
the field of so-called characteristic directions. An analysis of this geometric 
structure can reduce the integration of general first-order nonlinear partial 
differential equations to the integration of a system of ordinary differential 
equations. 

We assume that the manifold E²ⁿ is transverse to the contact planes at all 
its points. In this case, the intersection of the tangent plane to E²ⁿ at each of its 
points with the contact plane has dimension 2n − 1, so that we have a field 
of hyperplanes on E²ⁿ. Furthermore, the contact structure on M²ⁿ⁺¹ defines 
on E²ⁿ a field of lines lying in these (2n − 1)-dimensional planes. 

In fact, let α be a 1-form on M²ⁿ⁺¹ locally giving the contact structure; 
let ω = dα and let R²ⁿ be a contact plane at the point x in E²ⁿ. Let Φ = 0 
be the local equation of E²ⁿ (so dΦ is not zero at x). The restriction of dΦ to 
R²ⁿ defines a nonzero linear form on R²ⁿ. The 2-form ω gives R²ⁿ the structure 
of a symplectic vector space and thus an isomorphism of this space with its 
dual. The nonzero 1-form dΦ|_{R²ⁿ} corresponds to a nonzero vector ξ of R²ⁿ, 
so that dΦ(·) = ω(ξ, ·). The vector ξ is called the characteristic vector of the 
manifold E²ⁿ at the point x. The characteristic vector ξ lies in the 
intersection of R²ⁿ with the tangent plane to E²ⁿ, so that dΦ(ξ) = 0. 

The vector ξ is not uniquely defined by the manifold E²ⁿ and the contact 
structure on M, but only up to multiplication by a nonzero number. In fact, 
like the 2-form ω on R²ⁿ, the 1-form dΦ on R²ⁿ is defined only up to 
multiplication by a nonzero number. 

The direction of the characteristic vector (i.e., the line containing it) is 
determined uniquely by the contact structure at every point of the manifold 
E. Thus we have a field of characteristic directions on the hypersurface E of 
the contact manifold M. The integral curves of this field of directions are 
called the characteristics. 

Now suppose we are given an (n − 1)-dimensional submanifold I of our 
hypersurface E²ⁿ, which is integral for the contact field (so that the tangent 
plane to I at each point is contained in the contact plane). 


Theorem. If at a point x of I the characteristic on E²ⁿ is not tangent to I, then 
in a neighborhood of the point x the characteristics on E²ⁿ passing through 
points of I form a Legendre submanifold Lⁿ in M²ⁿ⁺¹. 


PROOF. Let ξ be a vector field on E²ⁿ made up of characteristic vectors. By 
the homotopy formula (cf. Section 36G) we have on E²ⁿ 


L_ξ α = d i_ξ α + i_ξ dα. 


But i_ξ α = 0 since the characteristic vector belongs to the contact plane. 
Therefore, on E²ⁿ we have L_ξ α = i_ξ ω. But the 1-form i_ξ ω is zero on the 
intersection of the tangent plane to E²ⁿ with the contact plane (since on the 
contact plane i_ξ ω = dΦ, and on the tangent plane dΦ = 0). Therefore, on 
the tangent plane to E²ⁿ we have i_ξ ω = cα. Thus on the hypersurface E, 


L_ξ α = cα 


(where c is a function smooth in a neighborhood of x). 

Now let {gᵗ} be the (local) phase flow of the field ξ and η a vector tangent 
to E²ⁿ. Set η(t) = gᵗ_* η and y(t) = α(η(t)). Then the function y satisfies the 
linear differential equation 


dy/dt = c(t) y(t). 


If η(0) is tangent to I, then y(0) = α(η(0)) = 0. By uniqueness of the solution 
of this linear equation, y(t) = α(η(t)) = 0, i.e., for all t, η(t) lies in the 
contact plane. Therefore, gᵗI is an integral 
manifold of the contact field. Therefore the manifold formed by all {gᵗI} for 
small t is a Legendre manifold. □ 


EXAMPLE. Consider R²ⁿ⁺¹ with coordinates x₁, ..., xₙ; p₁, ..., pₙ; u with 
contact structure defined by the 1-form α = du − p dx. A function Φ(x, p, u) 
defines a differential equation Φ(x, ∂u/∂x, u) = 0 and a submanifold E = 
Φ⁻¹(0) in the space R²ⁿ⁺¹ (called the space of 1-jets of functions on Rⁿ). 

An initial condition for the equation Φ = 0 is an assignment of a value f 
to the function u on an (n − 1)-dimensional hypersurface Γ in the 
n-dimensional space with coordinates x₁, ..., xₙ. 

An initial condition determines the derivatives of u in the n − 1 
independent directions at each point of Γ. The derivative in a direction transverse to 
Γ can generally be found from the equation; if the conditions of the implicit 
function theorem are fulfilled, then the initial condition is called 
noncharacteristic. 

A noncharacteristic initial condition defines an (n − 1)-dimensional 
integral submanifold I of the form α (the graph of the mapping u = f(x), p = p(x), 
x ∈ Γ). The characteristics on E intersecting I form a Legendre submanifold 
of R²ⁿ⁺¹, the graph of the mapping u = u(x), p = ∂u/∂x. The function u(x) 
is a solution of the equation Φ(x, ∂u/∂x, u) = 0 with initial condition u|_Γ = f. 
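
The construction can be tried on a concrete equation. The sketch below rests on assumptions not taken from the text: it uses the standard normalization of the characteristic field for α = du − p dx, namely ẋ = Φ_p, ṗ = −Φ_x − pΦ_u, u̇ = pΦ_p, and a hypothetical test problem with two independent variables (t, x), Φ = p_t + p_x²/2, and initial condition u(0, x) = x²/2, whose solution u = x²/(2(1 + t)) is known in closed form. All function names are illustrative.

```python
def rk4_step(f, state, dt):
    """One classical Runge-Kutta step for the system state' = f(state)."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def characteristic_rhs(state):
    """Characteristic field of Phi = p_t + p_x**2 / 2; state = (t, x, p_t, p_x, u)."""
    t, x, p_t, p_x, u = state
    phi_p = (1.0, p_x)     # dPhi/dp_t, dPhi/dp_x
    phi_x = (0.0, 0.0)     # Phi does not depend on t or x
    phi_u = 0.0            # nor on u
    return [phi_p[0], phi_p[1],
            -phi_x[0] - p_t * phi_u, -phi_x[1] - p_x * phi_u,
            p_t * phi_p[0] + p_x * phi_p[1]]


def solve_along_characteristic(x0, t_end, steps=100):
    """Follow the characteristic through (t, x) = (0, x0) with u(0, x) = x**2 / 2."""
    p_x0 = x0                    # derivative of the initial data along the initial line
    p_t0 = -0.5 * p_x0 ** 2      # transverse derivative, found from Phi = 0
    state = [0.0, x0, p_t0, p_x0, 0.5 * x0 ** 2]
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(characteristic_rhs, state, dt)
    return state


t, x, p_t, p_x, u = solve_along_characteristic(1.0, 1.0)
print(abs(u - x * x / (2.0 * (1.0 + t))) < 1e-9)
```

Sweeping `x0` over the initial line traces out the Legendre submanifold, and reading off u at each endpoint gives the solution as a function of (t, x).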

Note that to find the function u we need only solve the system of 2n first- 
order ordinary differential equations for the characteristics on E, and perform 
a series of “algebraic” operations. 

By the theorem of E. Noether, one-parameter groups of symmetries of a 
dynamical system determine first integrals. If a system admits a larger group 
of symmetries, then there are several integrals. Simultaneous level manifolds 
of these first integrals in the phase space are invariant manifolds of the phase 
flow. The subgroup of the group of symmetries mapping such an invariant 
manifold into itself acts on the manifold. In many cases, we can look at the 
quotient manifold of an invariant manifold by this subgroup. This quotient 
manifold, called the reduced phase space, has a natural symplectic structure. 
The original hamiltonian dynamical system induces a hamiltonian system 
on the reduced phase space. 

The partition of the phase space into simultaneous level manifolds 
generally has singularities. An example is the partition of a phase plane into 
energy level curves. 

In this appendix we will briefly discuss dynamical systems in reduced 
phase space and their relationship with invariant manifolds in the original 
space. All these questions were investigated by Jacobi and Poincaré (“elimin- 
ation of the nodes” in the many-body problem, “reduction of order” in 
systems with symmetries, “stationary rotations” of rigid bodies, etc.). A 
detailed presentation in current terminology can be found in the following 
articles: S. Smale, “Topology and mechanics,” Inventiones Mathematicae 
10:4 (1970) 305-331, 11:1 (1970), 45-64; and J. Marsden and A. Weinstein, 
“Reduction of symplectic manifolds with symmetries,” Reports on Mathe- 
matical Physics 5 (1974) 121-130. 


A Poisson action of Lie groups 


Consider a symplectic manifold (M²ⁿ, ω²) and suppose a Lie group G acts 
on it as a group of symplectic diffeomorphisms. Every one-parameter sub- 
group of G then acts as a locally hamiltonian phase flow on M. In many 
important cases, these flows have single-valued hamiltonian functions. 


EXAMPLE. Let V be a smooth manifold and G some Lie group of diffeomorphisms of V. Since 
every diffeomorphism takes 1-forms on V to 1-forms, the group G acts on the cotangent bundle 
M = T*V. 

Recall that on the cotangent bundle there is always a canonical 1-form α (“p dq”) and a 
natural symplectic structure ω = dα. The action of the group G on M is symplectic since it 
preserves the 1-form α and hence also the 2-form dα. 

A one-parameter subgroup {gᵗ} of G defines a phase flow on M. It is easy to verify that this 
phase flow has a single-valued hamiltonian function. In fact, the hamiltonian function is given 
by the formula from Noether’s theorem: 


H(x) = α(d/dt|_{t=0} gᵗx), where x ∈ M. 


We now assume that we are given a symplectic action of a Lie group G 
on a connected symplectic manifold M such that, to every element a of the 
Lie algebra of G, there corresponds a one-parameter group of symplectic 
diffeomorphisms with a single-valued hamiltonian H_a. These hamiltonians 
are determined up to the addition of constants which can be chosen so that 
the dependence of H_a upon a is linear. To do this, it is sufficient to choose 
arbitrarily the constants in the hamiltonians for a set of basis vectors of the 
Lie algebra of G, and to then define the hamiltonian function for each element 
of the algebra as a linear combination of the basis functions. 

Thus, given a symplectic action of a Lie group G with single-valued 
hamiltonians on M, we can construct a linear mapping of the Lie algebra of 
G into the Lie algebra of hamiltonian functions on M. The function H_{[a,b]} 
associated to the commutator of two elements of the Lie algebra is equal to 
the Poisson bracket (H_a, H_b), or else it differs from this Poisson bracket by a 
constant: 


H_{[a,b]} = (H_a, H_b) + C(a, b). 


Remark. The appearance of the constant C in this formula is a consequence of an interesting 
phenomenon: the existence of a two-dimensional cohomology class of the Lie algebra of 
(globally) hamiltonian fields. 

The quantity C(a, b) is a bilinear skew-symmetric function on the Lie algebra. The Jacobi 
identity gives us 


C([a, b], c) + C([b, c], a) + C([c, a], b) = 0. 


A bilinear skew-symmetric function on a Lie algebra with this property is called a two-dimensional 
cocycle of the Lie algebra. 

If we choose the constants in the hamiltonian functions differently, then the cocycle C is 
replaced by C’, where 


C'(a, b) = C(a, b) + p([a, b]) 


where p is a linear function on the Lie algebra. Such a cocycle C’ is said to be cohomologous to 
the cocycle C. A class of cocycles which are cohomologous to one another is called a cohomology 
class of the Lie algebra. 

Thus, a symplectic action of a group G for which single-valued hamiltonians exist defines a 
two-dimensional cohomology class of the Lie algebra of G. This cohomology class measures 
the deviation of the action from one in which the hamiltonian function of a commutator can be 
chosen equal to the Poisson bracket of the hamiltonian functions. 
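
A simple case in which the cocycle cannot be removed, offered as an illustrative sketch rather than as part of the original text: the group of translations of the phase plane with ω = dp ∧ dq. The generators a (translation along q) and b (translation along p) commute, yet their hamiltonians do not Poisson-commute. The bracket convention (F, G) = F_q G_p − F_p G_q is an assumption; with the opposite convention the sign of C flips.

```latex
H_a = p \ (\dot q = 1,\ \dot p = 0), \qquad
H_b = -q \ (\dot q = 0,\ \dot p = 1), \qquad [a,b] = 0,
\\[4pt]
(H_a, H_b) = \frac{\partial H_a}{\partial q}\frac{\partial H_b}{\partial p}
           - \frac{\partial H_a}{\partial p}\frac{\partial H_b}{\partial q} = 1,
\qquad
C(a,b) = H_{[a,b]} - (H_a, H_b) = 0 - 1 = -1 .
```

Since [a, b] = 0, adding constants to H_a and H_b changes C(a, b) by p([a, b]) = 0, so this cocycle is not cohomologous to zero.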


Definition. An action of a connected Lie group on a symplectic manifold is 
called a Poisson action if the hamiltonian functions for one-parameter 
groups are single-valued, and chosen so that the hamiltonian function 
depends linearly on elements of the Lie algebra and so that the hamiltonian 
function of a commutator is equal to the Poisson bracket of the hamil- 
tonian functions: 


H_{[a,b]} = (H_a, H_b). 


In other words, a Poisson action of a group defines a homomorphism from 
the Lie algebra of this group to the Lie algebra of hamiltonian functions. 


EXAMPLE. Let V be a smooth manifold and G a Lie group acting on V as a group of 
diffeomorphisms. Let M = T*V be the cotangent bundle of the manifold V with the usual symplectic 
structure ω = dα. The hamiltonian functions of one-parameter groups are defined as above: 


(1)  H_a(x) = α(d/dt|_{t=0} gᵗx),  x ∈ T*V. 

Theorem. This action is Poisson. 


PROOF. By definition of the 1-form α, the hamiltonian functions H_a are linear “in p” (i.e., on 
every cotangent space). Therefore, their Poisson brackets are also linear. Thus the function 
H_{[a,b]} − (H_a, H_b) is linear in p. Since it is constant, it is equal to zero. □ 

In the same way, we can show that the symplectification of any contact action is a Poisson 
action. 


EXAMPLE. Let V be three-dimensional euclidean space and G the six-dimensional group of its 
motions. The following six one-parameter groups form a basis of the Lie algebra: the 
translations with velocity 1 along the coordinate axes q₁, q₂, and q₃ and the rotations with angular 
velocity 1 around these axes. By formula (1), the corresponding hamiltonian functions are (in 
the usual notation) p₁, p₂, p₃; M₁, M₂, M₃, where M₁ = q₂p₃ − q₃p₂, etc. The theorem 
implies that the pairwise Poisson brackets of these six functions are equal to the hamiltonian 
functions of the commutators of the corresponding one-parameter groups. 
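
A few of these brackets can be verified directly. The sketch below assumes the bracket convention (F, G) = Σ F_{qᵢ} G_{pᵢ} − F_{pᵢ} G_{qᵢ} and evaluates it by central differences at an arbitrary sample point; the helper names and the point `z0` are hypothetical. With the opposite sign convention the right-hand sides change sign.

```python
def grad(f, z, h=1e-4):
    """Central-difference gradient of f at the point z (a list of floats)."""
    out = []
    for i in range(len(z)):
        lo, hi = list(z), list(z)
        lo[i] -= h
        hi[i] += h
        out.append((f(hi) - f(lo)) / (2.0 * h))
    return out


def bracket(F, G, z):
    """(F, G) = sum_i F_{q_i} G_{p_i} - F_{p_i} G_{q_i}; z = [q1, q2, q3, p1, p2, p3]."""
    dF, dG = grad(F, z), grad(G, z)
    return sum(dF[i] * dG[i + 3] - dF[i + 3] * dG[i] for i in range(3))


# Hamiltonians of the three translations: p1, p2, p3.
P = [lambda z, i=i: z[3 + i] for i in range(3)]

# Hamiltonians of the three rotations.
M = [
    lambda z: z[1] * z[5] - z[2] * z[4],  # M1 = q2 p3 - q3 p2
    lambda z: z[2] * z[3] - z[0] * z[5],  # M2 = q3 p1 - q1 p3
    lambda z: z[0] * z[4] - z[1] * z[3],  # M3 = q1 p2 - q2 p1
]

z0 = [0.3, -1.1, 0.8, 0.5, 0.2, -0.7]  # arbitrary sample point
print(abs(bracket(M[0], M[1], z0) - M[2](z0)) < 1e-8)  # (M1, M2) against M3
print(abs(bracket(M[0], P[1], z0) - P[2](z0)) < 1e-8)  # (M1, p2) against p3
print(abs(bracket(P[0], P[1], z0)) < 1e-8)             # translations commute
```

Because all six functions are at most quadratic, central differences are exact up to rounding, so the comparison is a genuine check of the algebra rather than an approximation.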


A Poisson action of a group G on a symplectic manifold M defines a 
mapping of M into the dual space of the Lie algebra of the group 


P: M → g*. 


That is, we fix a point x in M and consider the function on the Lie algebra 
which associates to an element a of the Lie algebra the value of the 
hamiltonian H_a at the fixed point x: 


p_x(a) = H_a(x). 


This p_x is a linear function on the Lie algebra and is the element of the dual 
space to the algebra associated to x: 


P(x) = p_x. 


Following Souriau (Structure des systémes dynamiques, Dunod, 1970), we 
will call the mapping P the momentum. Note that the value of the momentum 
is always a vector in the space g*. 


EXAMPLE. Let V be a smooth manifold, G a Lie group acting on V as a group of diffeomorphisms, 
M = T*V the cotangent bundle and H_a the hamiltonian functions constructed above of the 
action of G on M (cf. (1)). 

Then the “momentum” mapping P: M → g* can be described in the following way. 
Consider the map Φ: G → M given by the action of all the elements of G on a fixed point x in M 
(so Φ(g) = gx). The canonical 1-form α on M induces a 1-form Φ*α on G. Its restriction to the 
tangent space at the identity of G is a linear form on the Lie algebra. 

Thus to every point x in M we have associated a linear form on the Lie algebra. It is easy 
to verify that this mapping is the momentum of our Poisson action. 

In particular, if V is euclidean three-space and G is the group of rotations around the point 0, 
then the values of the momentum are the usual vectors of angular momentum; if G is the group 
of rotations around an axis, then the values of the momentum are the angular momenta relative 
to this axis; if G is the group of parallel translations, then the values of the momentum are the 
vectors of linear momentum. 


Theorem. Under the momentum mapping P, a Poisson action of a connected 
Lie group G is taken to the co-adjoint action of G on the dual space g* of its 
Lie algebra (cf. Appendix 2), i.e., the following diagram commutes: 


            g 
     M ---------> M 
     |            | 
   P |            | P 
     v            v 
     g* --------> g* 
          Ad*_g 


Corollary. Suppose that a hamiltonian function H: M — R is invariant under 
the Poisson action of a group G on M. Then the momentum is a first integral 
of the system with hamiltonian function H. 


PROOF OF THE THEOREM. The theorem asserts that the hamiltonian function H_a of the 
one-parameter group hᵗ is carried over by the diffeomorphism g to the hamiltonian function 
H_{Ad_g a} of the one-parameter group g hᵗ g⁻¹. 

Let gˢ be a one-parameter group with hamiltonian function H_b. It is sufficient to show that 
the derivatives with respect to s (for s = 0) of the functions H_a(gˢx) and H_{Ad_{gˢ} a}(x) are the same. 
The first of these derivatives is the value at x of the Poisson bracket (H_a, H_b). The second is 
H_{[a,b]}(x). Since the action is Poisson, the theorem is proved. □ 
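
For the rotation group acting on T*R³ the commutative diagram reduces to a familiar fact, which the following sketch checks numerically. It assumes the usual identification of g* with R³ under which the co-adjoint action of a rotation is the rotation itself, and that a rotation acts on both q and p by the same orthogonal matrix; the function names are hypothetical.

```python
import math


def cross(a, b):
    """Vector product a x b in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_z(v, angle):
    """Rotate v about the third coordinate axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


def momentum(q, p):
    """Angular momentum M = q x p, so that M1 = q2 p3 - q3 p2, etc."""
    return cross(q, p)


q, p, angle = (0.3, -1.1, 0.8), (0.5, 0.2, -0.7), 0.9
lhs = momentum(rotate_z(q, angle), rotate_z(p, angle))  # P(gx)
rhs = rotate_z(momentum(q, p), angle)                   # the rotation applied to P(x)
print(max(abs(a - b) for a, b in zip(lhs, rhs)) < 1e-12)
```

In words: rotating the state and then computing the angular momentum gives the same vector as computing the angular momentum first and then rotating it.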


PROOF OF THE COROLLARY. The derivative, in the direction of the phase flow with hamiltonian 
function H, of each component of the momentum is zero, since it is equal to the derivative of 
the function H in the direction of the phase flow corresponding to a one-parameter subgroup of G. □ 


B The reduced phase space 


Suppose that we are given a Poisson action of a group G on a symplectic
manifold M. Consider a level set of the momentum, i.e., the inverse image of
some point $p \in \mathfrak{g}^*$ under the map P. We denote this set by $M_p$, so that
(Figure 238)


$$M_p = P^{-1}(p).$$


In many important cases the set $M_p$ is a manifold. For example, this will
be so if p is a regular value of the momentum, i.e., if the differential of the map P
at each point of the set $M_p$ maps the tangent space to M onto the whole
tangent space to $\mathfrak{g}^*$.

In general, a Lie group G acting on M takes the sets $M_p$ into one another.
However, the stationary subgroup of a point p in the co-adjoint representation
(i.e., the subgroup consisting of those elements g of the group G for
which $\mathrm{Ad}^*_g p = p$) leaves $M_p$ fixed. We denote this stationary subgroup by
$G_p$. The group $G_p$ is a Lie group, and it acts on the level set $M_p$ of the
momentum.

Figure 238 Reduced phase space

The reduced phase space is obtained from $M_p$ by factoring by the action
of the group $G_p$. In order for such a factorization to make sense, it is necessary
to make several assumptions. For example, it is sufficient to assume that


1. p is a regular value, so that $M_p$ is a manifold,
2. The stationary subgroup $G_p$ is compact, and
3. The elements of the group $G_p$ act on $M_p$ without fixed points.


Remark. These conditions can be weakened. For example, instead of compactness of the
group $G_p$ we can require that the action be proper (i.e., that the inverse images of compact sets
under the mapping $(g, x) \mapsto (g(x), x)$ are compact). For example, the actions of a group on
itself by left and right translation are always proper.


If conditions (1), (2), and (3) are satisfied, then it is easy to give the set of
orbits of the action of $G_p$ on $M_p$ the structure of a smooth manifold. Namely,
a chart on a neighborhood of a point $x \in M_p$ is furnished by any local
transversal to the orbit $G_p x$, whose dimension is equal to the codimension of the
orbit.

The resulting manifold of orbits is called the reduced phase space of a 
system with symmetry. 

We will denote the reduced phase space corresponding to a value p of the
momentum by $F_p$. The manifold $F_p$ is the base space of the bundle $\pi: M_p \to F_p$
with fiber diffeomorphic to the group $G_p$.

There is a natural symplectic structure on the reduced phase space $F_p$.
Namely, consider any two vectors $\xi$ and $\eta$ tangent to $F_p$ at the point f. The
point f is one of the orbits of the group $G_p$ on the manifold $M_p$. Let x be
one of the points of this orbit. The vectors $\xi$ and $\eta$ tangent to $F_p$ are obtained
from some vectors $\xi'$ and $\eta'$ tangent to $M_p$ at the point x by the projection
$\pi: M_p \to F_p$.


Definition. The skew-scalar product of two vectors $\xi$ and $\eta$ which are tangent
to a reduced phase space at the same point is the skew-scalar product of
the corresponding vectors $\xi'$ and $\eta'$, tangent to the original symplectic
manifold M:


$$[\xi, \eta]_p = [\xi', \eta'].$$


Theorem.¹⁰¹ The skew-scalar product of the vectors $\xi$ and $\eta$ does not depend
on the choices of the point x and representatives $\xi'$ and $\eta'$, and gives a
symplectic structure on the reduced phase space.


Corollary. The reduced phase space is even-dimensional. 


PROOF OF THE THEOREM. We look at the following two spaces in the tangent
space to M at x:


$T(M_p)$, the tangent space to the level manifold $M_p$, and
$T(Gx)$, the tangent space to the orbit of the group G.


Lemma. These two spaces are skew-orthogonal complements to one another
in $TM_x$.


PROOF. A vector $\xi$ lies in the skew-orthogonal complement to the tangent plane of an orbit of
the group G if and only if the skew-scalar product of the vector $\xi$ with velocity vectors of the
hamiltonian flow of the group G is equal to zero (by definition). But these skew-scalar products
are equal to the derivatives of the corresponding hamiltonian functions in the direction $\xi$.
Therefore, the vector $\xi$ lies in the skew-orthogonal complement to the orbit of G if and only if
the derivative of the momentum in the direction $\xi$ is equal to zero, i.e., if $\xi$ lies in $T(M_p)$. □


The representatives $\xi'$ and $\eta'$ are defined up to addition of a vector from the tangent plane
to the orbit of the group $G_p$. But this tangent plane is the intersection of the tangent planes to
the orbit $Gx$ and to the manifold $M_p$ (by the last theorem of part A). Consequently, the addition
to $\xi'$ of a vector from $T(G_p x)$ does not change the skew-scalar product with any vector $\eta'$ from
$T(M_p)$ (since by the lemma $T(G_p x)$ is skew-orthogonal to $T(M_p)$). Thus, we have shown the
independence from the representatives $\xi'$ and $\eta'$.

The independence of the quantity $[\xi, \eta]_p$ from the choice of the point x of the orbit f follows
from the symplectic nature of the action of the group G on M and the invariance of $M_p$. Thus
we have defined a differential 2-form on $F_p$:


$$\Omega_p(\xi, \eta) = [\xi, \eta]_p.$$


It is nondegenerate, since if $[\xi, \eta]_p = 0$ for every $\eta$, then the corresponding representative
$\xi'$ is skew-orthogonal to all vectors in $T(M_p)$. Therefore, $\xi'$ must lie in the skew-orthogonal
complement to $T(M_p)$ in $TM_x$. Then by the lemma $\xi' \in T(Gx)$; since $\xi'$ is also tangent to $M_p$,
it is tangent to the orbit $G_p x$, i.e., $\xi = 0$.

The form $\Omega_p$ is closed. In order to verify this we consider a chart, i.e., a piece of submanifold
in $M_p$, transversally intersecting the orbit of the group $G_p$ in one point.

The form $\Omega_p$ is represented in this chart by a 2-form induced from the 2-form $\omega$ which defines
the symplectic structure in the whole space M, by means of the embedding of the submanifold
piece. Since the form $\omega$ is closed, the induced form is also closed. The theorem is proved.


101 The theorem was first formulated in this form by Marsden and Weinstein. Many special 
cases have been considered since the time of Jacobi and used by Poincaré and his successors in 
mechanics, by Kirillov and Kostant in group theory, and by Faddeev in the general theory of 
relativity. 

EXAMPLE 1. Let $M = \mathbb{R}^{2n}$ be euclidean space of dimension 2n with coordinates
$p_k, q_k$ and 2-form $\sum dp_k \wedge dq_k$. Let $G = S^1$ be the circle, and let the
action of G on M be given by the hamiltonian of a harmonic oscillator


$$H = \tfrac{1}{2} \sum (p_k^2 + q_k^2).$$


Then the momentum mapping is simply $H: \mathbb{R}^{2n} \to \mathbb{R}$, a nonzero momentum
level manifold is a sphere $S^{2n-1}$, and the quotient space is the complex
projective space $\mathbb{CP}^{n-1}$.

The preceding theorem defines a symplectic structure on this complex 
projective space. It is easy to verify that this structure coincides (up to a 
multiple) with the one we constructed in Appendix 3. 
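
The circle action of Example 1 can be checked numerically. The sketch below assumes the flow of the oscillator hamiltonian rotates every coordinate plane $(p_k, q_k)$ by the same angle t (the direction of rotation depends on the sign convention and does not matter here); the names `H`, `flow` and the sample point are illustrative only. It verifies that H is constant along the flow, so that the flow stays on a sphere, and that the flow closes up after time 2π, so that it really is an action of the circle.

```python
import math

def H(p, q):
    """Oscillator hamiltonian H = (1/2) * sum(p_k^2 + q_k^2)."""
    return 0.5 * sum(pk * pk + qk * qk for pk, qk in zip(p, q))

def flow(p, q, t):
    """Rotate every plane (p_k, q_k) by the angle t."""
    c, s = math.cos(t), math.sin(t)
    new_p = [c * pk - s * qk for pk, qk in zip(p, q)]
    new_q = [s * pk + c * qk for pk, qk in zip(p, q)]
    return new_p, new_q

n = 3
p0 = [0.3, -1.1, 0.8]
q0 = [1.4, 0.2, -0.5]

# H is invariant, so each orbit lies on a sphere H = const.
p1, q1 = flow(p0, q0, 0.83)
energy_drift = abs(H(p1, q1) - H(p0, q0))

# After time 2*pi every point returns to itself.
p2, q2 = flow(p0, q0, 2 * math.pi)
period_error = max(abs(a - b) for a, b in zip(p2 + q2, p0 + q0))

# Dimension count: sphere S^(2n-1) modulo the circle.
dim_level_set = 2 * n - 1
dim_reduced = dim_level_set - 1
```

The dimension count 2n − 2 agrees with the real dimension of the complex projective space of complex dimension n − 1.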


EXAMPLE 2. Let M be the cotangent bundle of a Lie group G, with the action
of G defined by left translation. Then $M_p$ is a submanifold of the
cotangent bundle of G, formed by those vectors which, after right translation
to the identity of the group, define the same element in the dual space to the
Lie algebra.

The manifolds $M_p$ are diffeomorphic to the group itself and are
right-invariant cross-sections of the cotangent bundle. All the values p are regular.

The stationary subgroup $G_p$ of the point p consists of those elements of
the group for which left and right translation of p give the same result. The
actions of elements different from the identity of $G_p$ on $M_p$ have no fixed
points (since there are none by right translation of the group onto itself).

The group $G_p$ acts properly (cf. remark above). Consequently, the space
of orbits of the group $G_p$ on $M_p$ is a symplectic manifold.

But this space of orbits is easily identified with the orbit of the point p
in the co-adjoint representation. Actually, we map the right-invariant
section $M_p$ of the cotangent bundle into the cotangent space to the group at
the identity with left translations. We get a mapping


$$\pi: M_p \to \mathfrak{g}^*.$$


The image of this mapping is the orbit of the point p in the co-adjoint
representation, and the fibers are the orbits of the action of the group $G_p$.
The symplectic structure of the reduced phase space thus defines a symplectic
structure in the orbits of the co-adjoint representation.

It is not hard to verify by direct calculation that this is the same structure 
which we discussed in Appendix 2. 


EXAMPLE 3. Let the group G be the circle $S^1$, and let it act without fixed
points on a manifold V. Then there is an action of the circle on the cotangent
bundle $M = T^*V$. We can define momentum level manifolds $M_p$ (of
codimension 1 in M) and quotient manifolds $F_p$ (the dimension of which is 2
less than the dimension of M).

In addition, we can construct a quotient manifold of the configuration
space V by identifying the points of each orbit of the group on V. We denote
this quotient manifold by W.


Theorem. The reduced phase space $F_p$ is symplectic and diffeomorphic to the
cotangent bundle of the quotient configuration manifold W.


PROOF. Let $\pi: V \to W$ be the factorization map, and $\omega \in T^*W$ a 1-form on W at the point $w = \pi v$.
The form $\pi^*\omega$ on V at the point v belongs to $M_0$ and projects to a point in the quotient $F_0$.
Conversely, the elements of $F_0$ are the invariant 1-forms on V which are equal to zero on the
orbits; they define 1-forms on W. We have constructed a mapping $T^*W \to F_0$; it is easy to see
that this is a symplectic diffeomorphism.

The case $p \ne 0$ is reduced to the case p = 0 as follows. Consider a riemannian metric on
V, invariant with respect to G. The intersection of $M_p$ with the cotangent plane to V at the point v
is a hyperplane. The quadratic form defined by the metric has a unique minimum point S(v) in
this hyperplane. Subtraction of the vector S(v) carries the hyperplane $M_p \cap T^*V_v$ into
$M_0 \cap T^*V_v$, and we obtain a possibly nonsymplectic diffeomorphism $F_p \to F_0$.

The difference between the symplectic structures on $T^*W$ induced by those of $F_p$ and $F_0$ is a
2-form, induced by a 2-form on W. □


C Applications to the study of stationary rotations 
and bifurcations of invariant manifolds 


Suppose that we are given a Poisson action of a group G on a symplectic
manifold M; let H be a function on M invariant under G. Let $F_p$ be a reduced
phase space (we assume that the conditions under which this can be defined
are satisfied).

The hamiltonian field with hamiltonian function H is tangent to every
momentum level manifold $M_p$ (since momentum is a first integral). The
induced field on $M_p$ is invariant with respect to $G_p$ and defines a field on the
reduced phase space $F_p$. This vector field on $F_p$ will be called the reduced
field.


Theorem. The reduced field on the reduced phase space is hamiltonian. The 
value of the hamiltonian function of the reduced field at any point of the 
reduced phase space is equal to the value of the original hamiltonian function 
at the corresponding point of the original phase space. 


PROOF. The relation defining a hamiltonian field $X_H$ with hamiltonian H on a manifold M
with form $\omega$


$$dH(\xi) = \omega(\xi, X_H) \quad \text{for every } \xi$$


implies an analogous relation for the reduced field in view of the definition of the symplectic
structure on $F_p$. □


EXAMPLE. Consider an asymmetric rigid body, fixed at a stationary point,
under the action of the force of gravity (or any potential force symmetric
with respect to the vertical axis).

The group $S^1$ of rotations with respect to a vertical line acts on the
configuration space SO(3). The hamiltonian function is invariant under
rotations, and therefore we obtain a reduced system on the reduced phase space.

The reduced phase space is, in this case, the cotangent bundle of the
quotient configuration space (cf. Example 3 above). Factorization of the
configuration space by the action of rotations around the vertical axis was
done by Poisson in the following way.

We will specify the position of the body by giving the position of an
orthonormal frame $(e_1, e_2, e_3)$. The three vertical components of the basis vectors
give a vector in three-dimensional euclidean space. The length of this vector
is 1 (why?). This Poisson vector¹⁰² $\gamma$ determines the original frame up to
rotations around a vertical line (why?).

Thus the quotient configuration space is represented by a two-dimensional
sphere $S^2$, and the reduced phase space is the cotangent bundle $T^*S^2$ with a
nonstandard symplectic structure. The reduced hamiltonian function on the
cotangent bundle is represented as the sum of the “kinetic energy of the
reduced motion,” which is quadratic in the cotangent vectors, and the
“effective potential” (the sum of the potential energy and the kinetic energy of
rotation around a vertical line).
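
The two properties of the Poisson vector used above (unit length, and invariance under rotations around the vertical) can be checked numerically. The sketch below assumes that the frame is stored as a rotation matrix R whose columns are the frame vectors in space coordinates, with the third space axis vertical; under that assumption the Poisson vector is the third row of R. The helper names are illustrative only.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_z(t):
    """Rotation around the vertical (third) axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation around the first axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def poisson_vector(R):
    """Vertical components of the three frame vectors (the columns of R)."""
    return list(R[2])

# A sample position of the body.
R = matmul(rot_z(0.4), matmul(rot_x(1.1), rot_z(-2.0)))
gamma = poisson_vector(R)
length = math.sqrt(sum(c * c for c in gamma))

# Turn the whole body around the vertical line and recompute.
R_turned = matmul(rot_z(0.9), R)
gamma_turned = poisson_vector(R_turned)
change = max(abs(a - b) for a, b in zip(gamma, gamma_turned))
```

The vector has length 1 because the rows of an orthogonal matrix are unit vectors, and it does not change because a rotation around the vertical leaves vertical components untouched.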


The transition to the reduced phase space in this case is almost by “elimination of the cyclic
coordinate $\varphi$.” The difference is that the usual procedure of elimination requires that the
configuration or phase space be a direct product by the circle, whereas in our case we have only a
bundle. This bundle can be made a direct product by decreasing the size of the configuration
space (i.e., by introducing coordinates with singularities at the poles); the advantage of the
approach above is that it makes it clear that there are no real singularities (except singularities
of the coordinate system) near the poles.


Definition. The phase curves in M which project to equilibrium positions in
the reduced system on the reduced phase space $F_p$ are called the relative
equilibria of the original system.


EXAMPLE. Stationary rotations of a rigid body which is fixed at its center of 
mass are relative equilibria. In the same way, rotations of a heavy rigid body 
with constant speed around the vertical axis are relative equilibria. 


Theorem. A phase curve of a system with a G-invariant hamiltonian function is a 
relative equilibrium if and only if it is the orbit of a one-parameter subgroup 
of G in the original phase space. 

PROOF. It is clear that a phase curve which is an orbit projects to a point. If a phase curve x(t)
projects to a point, then it can be expressed uniquely in the form x(t) = g(t)x(0), and it is then
easy to see that {g(t)} is a subgroup. □


102 Poisson showed that the equations of motion of a heavy rigid body can be written in terms
of $\gamma$ in a remarkably simple form, the “Euler-Poisson equations”:


$$\frac{dM}{dt} - [M, \omega] = \mu g\,[\gamma, l], \qquad \frac{d\gamma}{dt} = [\gamma, \omega].$$

Corollary 1. An asymmetrical rigid body in an axially symmetric potential
field, fixed at a point on the axis of the field, has at least two stationary 
rotations (for every value of the angular momentum with respect to the axis 
of symmetry). 


Corollary 2. An axially symmetric rigid body fixed at a point on the axis of 
symmetry, has at least two stationary rotations (for every value of the angular 
momentum with respect to the axis of symmetry). 


Both corollaries follow from the fact that a function on the sphere has at 
least two critical points. 

Another application of relative equilibria is that they can be used to 
investigate modifications of the topology of invariant manifolds under 
changes of the energy and momentum values. 


Theorem. The critical points of the momentum and energy mapping

$$P \times H: M \to \mathfrak{g}^* \times \mathbb{R}$$

on a regular momentum level set are exactly the relative equilibria.


PROOF. The critical points of the mapping $P \times H$ are the conditional extrema of H on the
momentum level manifold $M_p$ (since this level manifold is regular, i.e., for every x in $M_p$, we
have $P_* TM_x = T\mathfrak{g}^*_p$).

After factorization by $G_p$, the conditional extrema of H on $M_p$ define the critical points of
the reduced hamiltonian function (since H is invariant under $G_p$). □


The detailed study of relative equilibria and singularities of the energy- 
momentum mapping is not simple and has not been completely carried out, 
even in the classical problem of the motions of an asymmetrical rigid body 
in a gravitational field. The case when the center of gravity lies on one of the 
principal axes of inertia is treated in the supplement written by S. B. Katok
to the Russian translation¹⁰³ of the article by S. Smale cited in the beginning
of this appendix. In this problem the dimension of the phase space is six, and 
the group is the circle; the reduced phase space T*S? is four-dimensional. 

The nonsingular energy level manifolds in the reduced phase space are
(depending on the values of momentum and energy) of the following four
forms: $S^3$, $S^2 \times S^1$, $\mathbb{RP}^3$, and a “pretzel” obtained from the three-sphere $S^3$
by attaching two “handles” of the form


$$S^1 \times D^2 \qquad (D^2 = \text{the disc } \{(x, y) \mid x^2 + y^2 \le 1\}).$$


103 Uspekhi Matematicheskikh Nauk 27, no. 2 (1972) 78-133. 

In this appendix we give a list of normal forms to which we can reduce a
quadratic hamiltonian function by means of a real symplectic transformation.
This list was composed by D. M. Galin based on the work of J. Williamson
in “On an algebraic problem concerning the normal forms of linear dynamical
systems,” Amer. J. of Math. 58 (1936), 141-163. Williamson’s paper gives
the normal forms to which a quadratic form in a symplectic space over any
field can be reduced.


A Notation 


We will write the hamiltonian as

$$H = \tfrac{1}{2}(Ax, x),$$

where $x = (p_1, \ldots, p_n; q_1, \ldots, q_n)$ is a vector written in a symplectic basis
and A is a symmetric linear operator. The canonical equations then have the
form

$$\dot{x} = IAx, \quad \text{where } I = \begin{pmatrix} 0 & -E \\ E & 0 \end{pmatrix}.$$

By the eigenvalues of the hamiltonian we will mean the eigenvalues of the
linear infinitesimally-symplectic operator IA. In the same way, by a Jordan
block we will mean a Jordan block of the operator IA.

The eigenvalues of the hamiltonian are of four types: real pairs $(a, -a)$,
purely imaginary pairs $(ib, -ib)$, quadruples $(\pm a \pm ib)$, and zero eigenvalues.

The Jordan blocks corresponding to the two members of a pair or four 
members of a quadruple always have the same structure. 

In the case when the real part of an eigenvalue is zero, we have to dis- 
tinguish the Jordan blocks of even and odd order. There are an even number of 
blocks of odd order with zero eigenvalue and they can be naturally divided 
into pairs. 
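
The grouping of eigenvalues into pairs and quadruples reflects the fact that the characteristic polynomial of IA contains only even powers of the spectral parameter, so its roots are symmetric under both sign change and complex conjugation. The sketch below checks the evenness in exact rational arithmetic for one symmetric matrix A. The matrix entries are arbitrary illustrative values, the helper names are not from the text, and the characteristic polynomial is computed with the Faddeev-LeVerrier recursion, which is a standard algorithm chosen here for convenience.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply_I(A):
    """Return I*A for I = [[0, -E], [E, 0]]: swap the row blocks of A, negating the lower one."""
    half = len(A) // 2
    return [[-v for v in A[half + i]] for i in range(half)] + [list(r) for r in A[:half]]

def char_poly(M):
    """Coefficients c[0..n] with det(t*1 - M) = sum of c[k] * t**k (Faddeev-LeVerrier)."""
    n = len(M)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    aux = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        prod = matmul(M, aux)
        aux = [[prod[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
               for i in range(n)]
        traced = matmul(M, aux)
        c[n - k] = -sum(traced[i][i] for i in range(n)) / k
    return c

# An arbitrary symmetric operator A on a 4-dimensional symplectic space.
A = [[2, 1, 0, 3],
     [1, -1, 4, 0],
     [0, 4, 1, 2],
     [3, 0, 2, -2]]

coeffs = char_poly(apply_I(A))
odd_coeffs = [coeffs[k] for k in range(1, len(coeffs), 2)]
```

All odd coefficients vanish, so every nonzero eigenvalue z is accompanied by −z, and, the coefficients being real, by the complex conjugates of both.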

A complete list of normal forms follows. 


B Hamiltonians 


For a pair of Jordan blocks of order k with eigenvalues $\pm a$, the hamiltonian
is

$$H = -a \sum_{j=1}^{k} p_j q_j + \sum_{j=1}^{k-1} p_j q_{j+1}.$$


For a quadruple of Jordan blocks of order k with eigenvalues $\pm a \pm bi$
the hamiltonian is

$$H = -a \sum_{j=1}^{2k} p_j q_j + b \sum_{j=1}^{k} (p_{2j-1} q_{2j} - p_{2j} q_{2j-1}) + \sum_{j=1}^{2k-2} p_j q_{j+2}.$$

For a pair of Jordan blocks of order k with eigenvalue zero the hamiltonian
is

$$H = \sum_{j=1}^{k-1} p_j q_{j+1} \qquad (\text{for } k = 1,\ H = 0).$$


For a Jordan block of order 2k with eigenvalue zero, the hamiltonian is
of one of the following two inequivalent types:

$$H = \pm \frac{1}{2} \Bigl( \sum_{j=1}^{k-1} p_j p_{k-j} - \sum_{j=1}^{k} q_j q_{k-j+1} \Bigr) - \sum_{j=1}^{k-1} p_j q_{j+1}$$

(for k = 1 this is $H = \pm \tfrac{1}{2} q_1^2$).
For a pair of Jordan blocks of odd order 2k + 1 with purely imaginary
eigenvalues $\pm bi$, the hamiltonian is of one of the following two inequivalent
types:

$$H = \pm \frac{1}{2} \Bigl[ \sum_{j=1}^{k} (b^2 p_{2j} p_{2k-2j+2} + q_{2j} q_{2k-2j+2}) - \sum_{j=1}^{k+1} (b^2 p_{2j-1} p_{2k-2j+3} + q_{2j-1} q_{2k-2j+3}) \Bigr] - \sum_{j=1}^{2k} p_j q_{j+1}.$$

For k = 0, $H = \pm \tfrac{1}{2}(b^2 p_1^2 + q_1^2)$.

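For a pair of Jordan blocks of even order 2k with purely imaginary
eigenvalues $\pm bi$, the hamiltonian is of one of the following two inequivalent types:

$$H = \pm \frac{1}{2} \Bigl[ \sum_{j=1}^{k} \Bigl( \frac{1}{b^2}\, q_{2j-1} q_{2k-2j+1} + q_{2j} q_{2k-2j+2} \Bigr) - \sum_{j=1}^{k-1} (b^2 p_{2j+1} p_{2k-2j+1} + p_{2j+2} p_{2k-2j+2}) \Bigr] - b^2 \sum_{j=1}^{k} p_{2j-1} q_{2j} + \sum_{j=1}^{k} p_{2j} q_{2j-1}.$$

For k = 1, $H = \pm \frac{1}{2} \Bigl( \frac{1}{b^2}\, q_1^2 + q_2^2 \Bigr) - b^2 p_1 q_2 + p_2 q_1$.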


Williamson’s theorem. A real symplectic vector space with a given quadratic
form H can be decomposed into a direct sum of pairwise skew-orthogonal real
symplectic subspaces so that the form H is represented as a sum of forms of
the types indicated above on these subspaces.
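
As a check on the simplest entry of the list, one can build the operator IA for a pair of Jordan blocks of order k = 2 with eigenvalues ±a and confirm the block structure. The sketch below assumes that this normal form reads H = −a(p₁q₁ + p₂q₂) + p₁q₂, uses the coordinate ordering (p₁, p₂, q₁, q₂), and takes the sample value a = 3; the helper names are illustrative. It tests that (IA − a)(IA + a) is not zero while (IA − a)²(IA + a)² is zero, which means that each of the eigenvalues a and −a carries a Jordan block of order 2.

```python
def form_matrix(dim, terms):
    """Symmetric A with H = (1/2)(Ax, x); terms maps (i, j) to the coefficient of x_i * x_j in H."""
    A = [[0] * dim for _ in range(dim)]
    for (i, j), coefficient in terms.items():
        A[i][j] += coefficient
        A[j][i] += coefficient
    return A

def apply_I(A):
    """Return I*A for I = [[0, -E], [E, 0]]."""
    half = len(A) // 2
    return [[-v for v in A[half + i]] for i in range(half)] + [list(r) for r in A[:half]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(M, c):
    """Return M - c * identity."""
    return [[M[i][j] - (c if i == j else 0) for j in range(len(M))]
            for i in range(len(M))]

a = 3
P1, P2, Q1, Q2 = 0, 1, 2, 3  # positions of p_1, p_2, q_1, q_2 in the vector x

# H = -a (p_1 q_1 + p_2 q_2) + p_1 q_2
A = form_matrix(4, {(P1, Q1): -a, (P2, Q2): -a, (P1, Q2): 1})
IA = apply_I(A)

minus, plus = shift(IA, a), shift(IA, -a)
first_power = matmul(minus, plus)
second_power = matmul(matmul(minus, minus), matmul(plus, plus))

blocks_are_nontrivial = any(v != 0 for row in first_power for v in row)
blocks_have_order_two = all(v == 0 for row in second_power for v in row)
```

The same helpers can be reused for the other normal forms by changing the dictionary of coefficients.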


C Nonremovable Jordan blocks 


An individual hamiltonian in “general position” does not have multiple 
eigenvalues and reduces to a simple form (all the Jordan blocks are of first 
order). However, if we consider not an individual hamiltonian but a whole 
family of systems depending on parameters, then for some exceptional
values of the parameters more complicated Jordan structures can arise. We
can get rid of some of these by a small change of the family; others are
nonremovable and only slightly deformed after a small change of the family. If
the number l of parameters of the family is finite, then the number of
nonremovable types in l-parameter families is finite. The theorem of Galin
formulated below allows us to count all these types for any fixed l.

We denote by $n_1(z) \ge n_2(z) \ge \cdots \ge n_s(z)$ the dimensions of the Jordan
blocks with eigenvalues $z \ne 0$, and by $m_1 \ge m_2 \ge \cdots \ge m_u$ and
$\tilde m_1 \ge \tilde m_2 \ge \cdots \ge \tilde m_v$ the dimensions of the Jordan blocks with eigenvalue zero,
where the $m_j$ are even and the $\tilde m_j$ are odd (of every pair of blocks of odd
dimension, only one is considered).


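Theorem. In the space of all hamiltonians, the manifold of hamiltonians with
Jordan blocks of the indicated dimensions has codimension

$$c = \frac{1}{2} \sum_{z \ne 0} \Bigl[ \sum_{j=1}^{s(z)} (2j - 1)\, n_j(z) - 1 \Bigr] + \frac{1}{2} \sum_{j=1}^{u} (2j - 1)\, m_j + \sum_{j=1}^{v} \bigl[ 2(2j - 1)\, \tilde m_j + 1 \bigr] + 2 \sum_{j=1}^{u} \sum_{k=1}^{v} \min\{m_j, \tilde m_k\}.$$

(Note that, if zero is not an eigenvalue, then only the first term in the sum
is not zero.)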

Corollary. In l-parameter families in general position of linear hamiltonian
systems, the only systems which occur are those with Jordan blocks such that
the number c calculated by the formula above is not greater than l: all
cases with larger c can be eliminated by a small change of the family.


Corollary. In one- and two-parameter families, nonremovable Jordan blocks of
only the following 12 types occur:


$$l = 1: \quad (\pm a)^2,\ (\pm ia)^2,\ 0^2$$


(here the Jordan blocks are denoted by their determinants; for example,
$(\pm a)^2$ denotes a pair of Jordan blocks of order 2 with eigenvalues a and
−a, respectively);


$$l = 2: \quad (\pm a)^3,\ (\pm ai)^3,\ (\pm a \pm bi)^2,\ 0^4,\ (\pm a)^2(\pm b)^2,\ (\pm ai)^2(\pm bi)^2,\ (\pm a)^2(\pm bi)^2,\ (\pm a)^2 0^2,\ (\pm ai)^2 0^2$$


(the remaining eigenvalues are simple).
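
The count of 12 types can be reproduced by evaluating the codimension for each listed Jordan structure. The sketch below assumes the codimension formula in the form written in the docstring (half the sum over nonzero eigenvalues, half the weighted sum over even zero blocks, the sum over odd zero blocks, and the mixed minimum term); the function name `galin_codimension`, the data layout, and the sample values a = 2, b = 3 are illustrative choices, not notation from the text. Simple eigenvalues contribute nothing and are omitted.

```python
from fractions import Fraction

def galin_codimension(nonzero, even_zero=(), odd_zero=()):
    """Codimension c, assuming the formula

        c = 1/2 * sum_z [ sum_j (2j-1) n_j(z) - 1 ]
            + 1/2 * sum_j (2j-1) m_j
            + sum_j [ 2(2j-1) mt_j + 1 ]
            + 2 * sum_j sum_k min(m_j, mt_k)

    nonzero:   dict mapping each nonzero eigenvalue z to its block sizes n_1 >= n_2 >= ...
    even_zero: even sizes m_1 >= m_2 >= ... of zero-eigenvalue blocks
    odd_zero:  odd sizes mt_1 >= mt_2 >= ..., one entry per pair of odd blocks
    """
    first = sum(sum((2 * j - 1) * n for j, n in enumerate(sizes, start=1)) - 1
                for sizes in nonzero.values())
    second = sum((2 * j - 1) * m for j, m in enumerate(even_zero, start=1))
    third = sum(2 * (2 * j - 1) * mt + 1 for j, mt in enumerate(odd_zero, start=1))
    fourth = 2 * sum(min(m, mt) for m in even_zero for mt in odd_zero)
    return Fraction(first + second, 2) + third + fourth

a, b = 2, 3  # sample distinct values

def pair(z, size):
    """A pair of Jordan blocks of the given size with eigenvalues z and -z."""
    return {z: [size], -z: [size]}

def quadruple(z, size):
    """A quadruple of Jordan blocks with eigenvalues +-Re(z) +- i*Im(z)."""
    return {w: [size] for w in (z, -z, z.conjugate(), -z.conjugate())}

one_parameter = [
    galin_codimension(pair(a, 2)),                      # (+-a)^2
    galin_codimension(pair(a * 1j, 2)),                 # (+-ia)^2
    galin_codimension({}, even_zero=(2,)),              # 0^2
]
two_parameter = [
    galin_codimension(pair(a, 3)),                      # (+-a)^3
    galin_codimension(pair(a * 1j, 3)),                 # (+-ai)^3
    galin_codimension(quadruple(complex(a, b), 2)),     # (+-a +- bi)^2
    galin_codimension({}, even_zero=(4,)),              # 0^4
    galin_codimension({**pair(a, 2), **pair(b, 2)}),              # (+-a)^2 (+-b)^2
    galin_codimension({**pair(a * 1j, 2), **pair(b * 1j, 2)}),    # (+-ai)^2 (+-bi)^2
    galin_codimension({**pair(a, 2), **pair(b * 1j, 2)}),         # (+-a)^2 (+-bi)^2
    galin_codimension(pair(a, 2), even_zero=(2,)),      # (+-a)^2 0^2
    galin_codimension(pair(a * 1j, 2), even_zero=(2,)), # (+-ai)^2 0^2
]
```

Every structure in the first list has codimension 1 and every structure in the second has codimension 2, which accounts for the 3 + 9 = 12 types.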

Galin has also computed the normal forms to which one can reduce any
family of linear hamiltonian systems which depend smoothly on parameters,
by using a symplectic linear change of coordinates which depends smoothly
on the parameters. For example, for the simplest Jordan square $(\pm a)^2$, the
normal form of the hamiltonian will be


$$H(\lambda) = -a(p_1 q_1 + p_2 q_2) + p_1 q_2 + \lambda_1 p_1 q_1 + \lambda_2 p_2 q_1$$


($\lambda_1$ and $\lambda_2$ are the parameters).

In studying the behavior of solutions to Hamilton’s equations near an
equilibrium position, it is often insufficient to look only at the linearized 
equation. In fact, by Liouville’s theorem on the conservation of volume, 
it is impossible to have asymptotically stable equilibrium positions for hamil- 
tonian systems. Therefore, the stability of the linearized system is always 
neutral: the eigenvalues of the linear part of a hamiltonian vector field at a 
stable equilibrium position all lie on the imaginary axis. 

For systems of differential equations in general form, such neutral 
stability can be destroyed by the addition of arbitrarily small nonlinear 
terms. For hamiltonian systems the situation is more complicated. Suppose, 
for example, that the quadratic part of the hamiltonian function at an 
equilibrium position (which determines the linear part of the vector field) is 
(positive or negative) definite. Then the hamiltonian function has a maximum 
or minimum at the equilibrium position. Therefore, this equilibrium position 
is stable (in the sense of Liapunov, but not asymptotically), not only for the 
linearized system but also for the entire nonlinear system. 

On the other hand, the quadratic part of the hamiltonian function at a 
stable equilibrium position may not be definite. A simple example is supplied 
by the function $H = p_1^2 + q_1^2 - p_2^2 - q_2^2$. To investigate the stability of
systems with this kind of quadratic part, we must take into account terms of
degree $\ge 3$ in the Taylor series of the hamiltonian function (i.e., the terms of
degree $\ge 2$ for the phase velocity vector field). It is useful to carry out this
investigation by reducing the hamiltonian function (and, therefore, the 
hamiltonian vector field) to the simplest possible form by a suitable canonical 
change of variables. In other words, it is useful to choose a canonical co- 
ordinate system, near the equilibrium position, in which the hamiltonian 
function and equations of motion are as simple as possible. 

The analogous question for general (non-hamiltonian) vector fields can 
be solved easily: there the general case is that a vector field in a neighborhood 
of an equilibrium position is linear in a suitable coordinate system (the 
relevant theorems of Poincaré and Siegel can be found, for instance, in the 
book, Lectures on Celestial Mechanics, by C. L. Siegel and J. Moser, 
Springer-Verlag, 1971.) 

In the hamiltonian case the picture is more complicated. The first difficulty 
is that reduction of the hamiltonian field to a linear normal form by a 
canonical change of variables is generally not possible. We can usually kill 
the cubic part of the hamiltonian function, but we cannot kill all the terms of 
degree four (this is related to the fact that, in a linear system, the frequency of 
oscillation does not depend on the amplitude, while in a nonlinear system it 
generally does). This difficulty can be surmounted by the choice of a nonlinear 
normal form which takes the frequency variations into account. As a result, 
we can (in the “non-resonance” case) introduce action-angle variables near 
an equilibrium position so that the system becomes integrable up to terms of 
arbitrarily high degree in the Taylor series.

This method allows us to study the behavior of systems over the course of
large intervals of time for initial conditions close to equilibrium. However, 
it is not sufficient to determine whether an equilibrium position will be 
Liapunov stable (since on an infinite time interval the influence of the dis- 
carded remainder term of the Taylor series can destroy the stability). Such 
stability would follow from an exact reduction to an analogous normal form 
which did not disregard remainder terms. However, we can show that 
this exact reduction is generally not possible, and formal series for canonical 
transformations reducing a system to normal form generally diverge. 

The divergence of these series is connected with the fact that reduction 
to normal form would imply simpler behavior of the phase curves (they 
would have to be conditionally periodic windings of tori) than that which 
in fact occurs. The behavior of phase curves near an equilibrium position is 
discussed in Appendix 8. In this appendix we give the formal results on nor- 
malization up to terms of high degree. 

The idea of reducing hamiltonian systems to normal forms goes back to
Lindstedt and Poincaré;¹⁰⁴ normal forms in a neighborhood of an
equilibrium position were extensively studied by G. D. Birkhoff (G. D. Birkhoff,
Dynamical Systems, American Math. Society, 1927).

Normal forms for degenerate cases can be found in the work of A. D. 
Bruno, “Analytic forms of differential equations,” (Trudy Moskovskogo 
matematicheskogo obshchestva, v. 25 and v. 26). 


A Normal form of a conservative system near an 
equilibrium position 


Suppose that in the linear approximation an equilibrium position of a
hamiltonian system with n degrees of freedom is stable, and that all n
characteristic frequencies $\omega_1, \ldots, \omega_n$ are different. Then the quadratic part of the
hamiltonian can be reduced by a canonical linear transformation to the
form


$$H = \tfrac{1}{2}\omega_1(p_1^2 + q_1^2) + \cdots + \tfrac{1}{2}\omega_n(p_n^2 + q_n^2).$$


(Some of the numbers $\omega_l$ may be negative.)


Definition. The characteristic frequencies $\omega_1, \ldots, \omega_n$ satisfy a resonance
relation of order K if there exist integers $k_l$ not all equal to zero such that

$$k_1\omega_1 + \cdots + k_n\omega_n = 0, \qquad |k_1| + \cdots + |k_n| = K.$$
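
Resonance relations up to a given order can be listed by brute force when the frequencies are given exactly. The sketch below is only an illustration: the function name `resonances` is hypothetical, and the frequencies are assumed to be integers or `Fraction` values so that the test for zero is exact (with floating-point frequencies an exact zero test would not be meaningful).

```python
from fractions import Fraction
from itertools import product

def resonances(frequencies, max_order):
    """All integer vectors k != 0 with k . omega = 0 and |k_1| + ... + |k_n| <= max_order."""
    n = len(frequencies)
    found = []
    for k in product(range(-max_order, max_order + 1), repeat=n):
        order = sum(abs(ki) for ki in k)
        if 0 < order <= max_order and sum(ki * w for ki, w in zip(k, frequencies)) == 0:
            found.append(k)
    return found

# Frequencies (1, -2): the relation 2*omega_1 + omega_2 = 0 has order 3.
up_to_order_2 = resonances((1, -2), 2)
up_to_order_3 = resonances((1, -2), 3)

# Rational frequencies work as well: 3*(1/3) - 2*(1/2) = 0 has order 5.
rational_case = resonances((Fraction(1, 3), Fraction(1, 2)), 5)
```

For the frequencies (1, −2) there is no resonance of order 2 or less, and exactly one relation (up to sign) of order 3.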

Definition. A Birkhoff normal form of degree s for a hamiltonian is a
polynomial of degree s in the canonical coordinates $(P_l, Q_l)$ which is actually
a polynomial (of degree [s/2]) in the variables $\tau_l = (P_l^2 + Q_l^2)/2$.


104 Cf. H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste, Vol. 1, Dover, 1957.

For example, for a system with one degree of freedom the normal form of degree 2m (or 2m + 1)
looks like


$$H_{2m} = H_{2m+1} = a_1\tau + a_2\tau^2 + \cdots + a_m\tau^m, \qquad \tau = (P^2 + Q^2)/2,$$

and for a system with two degrees of freedom the Birkhoff normal form of degree 4 will be

$$H_4 = a_1\tau_1 + a_2\tau_2 + a_{11}\tau_1^2 + a_{12}\tau_1\tau_2 + a_{22}\tau_2^2.$$


The coefficients $a_1$ and $a_2$ are characteristic frequencies, and the coefficients $a_{ij}$ describe the
dependence of the frequencies on the amplitude.


Theorem. Assume that the characteristic frequencies $\omega_l$ do not satisfy any
resonance relation of order s or smaller. Then there is a canonical
coordinate system in a neighborhood of the equilibrium position such that
the hamiltonian is reduced to a Birkhoff normal form of degree s up to terms
of order s + 1:


$$H(p, q) = H_s(P, Q) + R, \qquad R = O(|P| + |Q|)^{s+1}.$$


PROOF. The proof of this theorem is easy to carry out in a complex coordinate system

$$z_l = p_l + i q_l, \qquad w_l = p_l - i q_l$$

(upon passing to this coordinate system we must multiply the hamiltonian by −2i). If the terms
of degree less than N not entering into the normal form are already killed, then the transformation
with generating function $Pq + S_N(P, q)$ (where $S_N$ is a homogeneous polynomial of degree N)
changes only terms of degree N and higher in the Taylor expansion of the hamiltonian function.

Under this transformation the coefficient for a monomial of degree N in the hamiltonian
function having the form

$$z_1^{\alpha_1} \cdots z_n^{\alpha_n} w_1^{\beta_1} \cdots w_n^{\beta_n} \qquad (\alpha_1 + \cdots + \alpha_n + \beta_1 + \cdots + \beta_n = N)$$

is changed by the quantity

$$s_{\alpha\beta} [\lambda_1(\beta_1 - \alpha_1) + \cdots + \lambda_n(\beta_n - \alpha_n)],$$

where $\lambda_l = i\omega_l$ and where $s_{\alpha\beta}$ is the coefficient for $z^\alpha w^\beta$ in the expansion of the function $S_N(P, q)$
in the variables z and w.

Under the assumptions about the absence of resonance, the coefficient of $s_{\alpha\beta}$ in the square
brackets is not zero, except in the case when our monomial can be expressed in terms of the
products $z_l w_l = 2\tau_l$ (i.e., when all the $\alpha_l$ are equal to the $\beta_l$). Thus we can kill all terms of degree N
except those expressed in terms of the variables $\tau_l$. Setting N = 3, 4, ..., s, we obtain the theorem. □


To use Birkhoff’s theorem, it is helpful to note that a hamiltonian in normal
form is integrable. Consider the “canonical polar coordinates” $\tau_l, \varphi_l$, in
which $P_l$ and $Q_l$ can be expressed by the formulas


$$P_l = \sqrt{2\tau_l}\,\cos\varphi_l, \qquad Q_l = \sqrt{2\tau_l}\,\sin\varphi_l.$$


Since the hamiltonian is expressed in terms of only the action variables $\tau_l$,
the system is integrable and describes conditionally periodic motions on the
tori $\tau$ = const with frequencies $\omega = \partial H/\partial \tau$. In particular, the equilibrium
position P = Q = 0 is stable for the normal form.
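
The dependence of the frequencies on the amplitude can be made explicit for the normal form of degree 4 with two degrees of freedom. The sketch below assumes that normal form is a₁τ₁ + a₂τ₂ + a₁₁τ₁² + a₁₂τ₁τ₂ + a₂₂τ₂², differentiates it by hand, and compares one frequency with a finite-difference derivative; it also checks that the canonical polar coordinates recover the action variable. The function names and the numerical coefficients are illustrative only.

```python
import math

def normal_form(tau, coeff):
    """Degree-4 normal form in the action variables tau = (tau_1, tau_2)."""
    a1, a2, a11, a12, a22 = coeff
    t1, t2 = tau
    return a1 * t1 + a2 * t2 + a11 * t1 * t1 + a12 * t1 * t2 + a22 * t2 * t2

def frequencies(tau, coeff):
    """Partial derivatives of the normal form with respect to tau_1 and tau_2."""
    a1, a2, a11, a12, a22 = coeff
    t1, t2 = tau
    return (a1 + 2 * a11 * t1 + a12 * t2,
            a2 + a12 * t1 + 2 * a22 * t2)

def to_cartesian(tau, phi):
    """P = sqrt(2 tau) cos(phi), Q = sqrt(2 tau) sin(phi)."""
    r = math.sqrt(2 * tau)
    return r * math.cos(phi), r * math.sin(phi)

coeff = (1.0, -0.7, 0.25, -0.1, 0.4)  # sample values of a_1, a_2, a_11, a_12, a_22
tau = (0.02, 0.05)

omega = frequencies(tau, coeff)

# Central finite difference in tau_1 as an independent check of omega_1.
h = 1e-6
finite_difference = (normal_form((tau[0] + h, tau[1]), coeff)
                     - normal_form((tau[0] - h, tau[1]), coeff)) / (2 * h)

# The action variable is recovered from the cartesian coordinates.
P, Q = to_cartesian(tau[0], 1.3)
recovered_tau = 0.5 * (P * P + Q * Q)
```

At τ = 0 the frequencies reduce to a₁ and a₂; away from the equilibrium they are shifted by the terms with a₁₁, a₁₂, a₂₂.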


B Normal form of a canonical transformation near a stationary point


Consider a canonical (i.e., area-preserving) mapping of the two-dimensional
plane to itself. Assume that this transformation leaves the origin fixed, and
that its linear part has eigenvalues $\lambda = e^{\pm i\alpha}$ (i.e., is a rotation by angle $\alpha$ in a
suitable symplectic basis with coordinates p, q). We will call such a
transformation elliptic.


Definition. A Birkhoff normal form of degree s for a transformation is a
canonical transformation of the plane to itself which is a rotation by a variable
angle which is a polynomial of degree not more than m = [s/2] - 1
in the action variable $\tau$ of the canonical polar coordinate system:

$$(\tau, \varphi) \mapsto (\tau, \varphi + \alpha_0 + \alpha_1\tau + \cdots + \alpha_m\tau^m),$$

where

$$p = \sqrt{2\tau}\,\cos\varphi, \qquad q = \sqrt{2\tau}\,\sin\varphi.$$


Theorem 2. If the eigenvalue $\lambda$ of an elliptic canonical transformation is not a
root of unity of degree s or less, then this transformation can be reduced by a
canonical change of variables to a Birkhoff normal form of degree s with
error terms of degree s + 1 and higher.
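
To make the definition concrete, here is a minimal sketch of a map of this type in the coordinates (p, q), together with a finite-difference check that it preserves area. The coefficient list `alphas`, the function names, and the step `h` are illustrative assumptions and do not come from the text.

```python
import math


def birkhoff_map(p, q, alphas):
    """Rotate (p, q) by the angle alphas[0] + alphas[1]*tau + ...,
    where tau = (p^2 + q^2) / 2 is the action variable."""
    tau = 0.5 * (p * p + q * q)
    angle = sum(a * tau ** k for k, a in enumerate(alphas))
    c, s = math.cos(angle), math.sin(angle)
    return c * p - s * q, s * p + c * q


def jacobian_det(f, p, q, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of f at (p, q)."""
    fp1, fp0 = f(p + h, q), f(p - h, q)
    fq1, fq0 = f(p, q + h), f(p, q - h)
    a = (fp1[0] - fp0[0]) / (2 * h)   # d(p')/dp
    b = (fq1[0] - fq0[0]) / (2 * h)   # d(p')/dq
    c = (fp1[1] - fp0[1]) / (2 * h)   # d(q')/dp
    d = (fq1[1] - fq0[1]) / (2 * h)   # d(q')/dq
    return a * d - b * c
```

For any choice of `alphas` the estimated determinant should be close to 1, and the action $\tau$ of a point should be unchanged by the map, since only the angle is shifted.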


The multi-dimensional generalization of an elliptic transformation is the
direct product of n elliptic rotations of the planes $(p_l, q_l)$ with eigenvalues
$\lambda_l = e^{\pm i\alpha_l}$. A Birkhoff normal form of degree s is given by the formula

$$(\tau, \varphi) \mapsto \left(\tau, \varphi + \frac{\partial S}{\partial\tau}\right),$$

where S is a polynomial of degree not more than [s/2] in the action variables
$\tau_1, \ldots, \tau_n$.


Theorem 3. If the eigenvalues $\lambda_l$ of a multi-dimensional elliptic canonical
transformation do not admit resonances

$$\lambda_1^{k_1} \cdots \lambda_n^{k_n} = 1, \qquad |k_1| + \cdots + |k_n| \le s,$$

then this transformation can be reduced to a Birkhoff normal form of degree s
(with error in terms of degree s in the expansion of the mapping in a Taylor
series at the point p = q = 0).


C Normal form of an equation with periodic coefficients near an equilibrium position


Let p = q = 0 be an equilibrium position of a system whose hamiltonian
function depends $2\pi$-periodically on time. Assume that the linearized
equation can be reduced by a linear symplectic time-periodic transformation to an
autonomous normal form with characteristic frequencies $\omega_1, \ldots, \omega_n$.

We say that a system is resonant of order K > 0 if there is a relation

$$k_1\omega_1 + \cdots + k_n\omega_n + k_0 = 0$$

with integers $k_0, k_1, \ldots, k_n$ for which $|k_1| + \cdots + |k_n| = K$.
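
The definition can be tried out with a small search. The sketch below is only an illustration under the assumption that the frequencies are given exactly as rational numbers (`fractions.Fraction`); for irrational or floating-point frequencies a tolerance would be needed instead. The function name `resonances` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def resonances(omegas, max_order):
    """Return sorted triples (order, k0, (k1, ..., kn)) with
    k1*w1 + ... + kn*wn + k0 = 0 and 0 < |k1| + ... + |kn| <= max_order."""
    found = []
    span = range(-max_order, max_order + 1)
    for ks in product(span, repeat=len(omegas)):
        order = sum(abs(k) for k in ks)
        if not 0 < order <= max_order:
            continue
        combo = sum(k * w for k, w in zip(ks, omegas))
        if combo.denominator == 1:        # an integer, so k0 = -combo works
            found.append((order, -int(combo), ks))
    return sorted(found)
```

For a single frequency equal to 1/3 the search finds nothing below order 3 and exactly the two relations $\pm(3\omega - 1) = 0$ at order 3.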


Theorem. If a system is not resonant of order s or less, then there is a
$2\pi$-periodic time-dependent canonical transformation reducing the system in a
neighborhood of an equilibrium position to the same Birkhoff normal form
of degree s as if the system were autonomous, with only the difference that
the remainder terms R of degree s + 1 and higher will depend periodically
on time.


Finally, suppose that we are given a closed trajectory of an autonomous 
hamiltonian system. Then, in a neighborhood of this trajectory, we can 
reduce the system to normal form by using either of the following two 
methods: 


1. Isoenergetic reduction: Fix an energy constant and consider a
neighborhood of the closed trajectory on the $(2n - 1)$-dimensional energy level
manifold as the extended phase space of a system with $n - 1$ degrees of
freedom, periodically depending on time.

2. Surface of section: Fix an energy constant and value of one of the
coordinates (so that the closed trajectory intersects the resulting
$(2n - 2)$-dimensional manifold transversally). Then phase curves near the given
one define a mapping of this $(2n - 2)$-dimensional manifold to itself,
with a fixed point on the closed trajectory. This mapping preserves the
natural structure on our $(2n - 2)$-dimensional manifold, and we can
study it by using the normal form in Section B.


In investigating closed trajectories of autonomous hamiltonian systems, 
a phenomenon arises which contrasts with the general theory of equilibrium 
positions of systems with periodic coefficients. The fact is that the closed 
trajectories of an autonomous system are not isolated, but form (as a rule) 
one-parameter families. The parameter of the family is the value of the energy 
constant. In fact, assume that for some choice of the energy constant the
closed trajectory intersects transversally the $(2n - 2)$-dimensional manifold
described above in the $(2n - 1)$-dimensional energy level manifold. Then
for nearby values of the energy, there will exist a similar closed trajectory.
By the implicit function theorem we can even say that this closed trajectory 
depends smoothly on the energy constant. 

If we now wish to use the Birkhoff normal form to investigate a one- 
parameter family of closed trajectories, we encounter the following difficulty. 
As the parameter describing the family varies, the eigenvalues of the linearized 
problem will generally change. Therefore, for some values of the parameter 
we will inevitably encounter resonances, obstructing reduction to the normal 
form. 

Especially dangerous are resonances of low order, since they influence
the first few terms of the Taylor series. If we are interested in a closed trajectory
for which the eigenvalues nearly satisfy a resonance relation of low order,
then the Birkhoff form must be somewhat modified. Namely, for resonance
of order N some of the expressions

$$k_0 - [\omega_1(\beta_1 - \alpha_1) + \cdots + \omega_n(\beta_n - \alpha_n)], \qquad |\alpha| + |\beta| = N,$$

by which we must divide to kill the terms of order N in the hamiltonian
function, may become zero. For non-resonant values of the parameter which
are close to resonance, this combination of characteristic frequencies is
generally not zero, but very small (this combination is therefore called a
“small denominator”).

Division by a small denominator leads to the following difficulties: 


1. The transformation which reduces to normal form depends discon- 
tinuously on the parameter (it has poles for resonant values of the param- 
eter); 

2. The region in which the Birkhoff normal form accurately describes the 
system contracts to zero at resonance. 


In order to get rid of these deficiencies, we must give up trying to annihilate
some of the terms of the hamiltonian (namely, those which become resonant
for resonance values of the parameter). Moreover, these terms must be
preserved not only for resonance, but also for nearby values of the
parameter.¹⁰⁵ The normal form thus obtained is somewhat more complicated than
the usual normal form, but in many cases it gives us useful information on
the behavior of solutions near resonance.


D Example: Resonance of order 3 


As a simple example, we will study what happens to a closed trajectory of an
autonomous hamiltonian system with two degrees of freedom, for which
the period of oscillation (about the closed trajectory) of neighboring
trajectories is three times the period of the closed trajectory itself. By what we said
above, this problem may be reduced to an investigation of a one-parameter
family of non-autonomous hamiltonian systems with one degree of freedom,
$2\pi$-periodically depending on time, in a neighborhood of an equilibrium
position. This equilibrium position can be taken as the origin for all values of
the parameter (to achieve this we must make a change of variables depending
on the parameter).

Furthermore, the linearized system at the equilibrium position can be
converted into a linear system with constant coefficients by a $2\pi$-periodically
time-dependent linear canonical change of variables. In the new coordinates
the phase flow of the linearized system is represented as a uniform rotation
around the equilibrium position. The angular velocity $\omega$ of this rotation
depends on the parameter.

¹⁰⁵ The method indicated here is useful not only in investigating hamiltonian systems, but also
in the general theory of differential equations. Cf., for example, V. I. Arnold, “Lectures on
bifurcations and versal families,” Russian Math. Surveys 27, No. 5, 1972, 54-123.

At the resonance value of the parameter, $\omega = 1/3$ (i.e., after time $2\pi$, we have
gone one-third of the way around the origin). The derivative of the angular
velocity $\omega$ with respect to the parameter is generally not zero. Therefore, we
can take as a parameter this angular velocity or, even better, its difference
from 1/3. We will denote this difference by $\varepsilon$. The quantity $\varepsilon$ is called the
frequency deviation or detuning. The resonance value of the parameter is
$\varepsilon = 0$, and we are interested in the behavior of the system for small $\varepsilon$.

If we disregard the nonlinear terms in Hamilton’s equations and
disregard the frequency deviation $\varepsilon$, then all trajectories of our system become
closed after making three revolutions (i.e., they have period $6\pi$). We now
want to study the influence of the nonlinear terms and frequency deviation
on the behavior of the trajectories. It is clear that in the general case not all the
trajectories will be closed. To study their behavior, it is useful to look at
the normal form.

In the chosen coordinate system, $z = p + iq$, $\bar z = p - iq$, the hamiltonian
function has the form

$$-2iH = -i\omega z\bar z + \sum_{\alpha+\beta=3}\ \sum_{k=-\infty}^{\infty} h_{\alpha\beta k}\, z^{\alpha}\bar z^{\beta} e^{ikt} + \cdots,$$

where the dots indicate terms of order higher than three, and where
$\omega = (1/3) + \varepsilon$.

In the reduction to normal form we can kill all terms of degree three
except those for which the small denominator

$$\omega(\alpha - \beta) + k$$

becomes zero at resonance. These terms can be described also as those
which are constant along trajectories of the periodic motion obtained by
disregarding the frequency deviation and nonlinearity. They are called the
resonant terms. Thus, for resonance $\omega = 1/3$, the resonant terms are those for
which

$$\alpha - \beta + 3k = 0.$$


Of the terms of third order, only $z^3 e^{-it}$ and $\bar z^3 e^{it}$ turn out to be resonant.
Thus we can reduce the hamiltonian function to the form

$$-2iH = -i\omega z\bar z + h z^3 e^{-it} - \bar h \bar z^3 e^{it} + \cdots$$

(the conjugacy of $h$ and $\bar h$ corresponds to the fact that H is real).
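
The selection of resonant terms can be checked by direct enumeration. The sketch below is a hedged illustration: it lists the triples $(\alpha, \beta, k)$ with $\alpha + \beta$ equal to a given degree for which $\alpha - \beta + mk = 0$, that is, for which the denominator $\omega(\alpha - \beta) + k$ vanishes at $\omega = 1/m$. The function name and the cutoff `k_range` on the Fourier index are assumptions made for the example.

```python
def resonant_terms(degree, m, k_range=10):
    """Triples (alpha, beta, k) with alpha + beta = degree and
    alpha - beta + m*k = 0, searched over |k| <= k_range."""
    out = []
    for alpha in range(degree + 1):
        beta = degree - alpha
        for k in range(-k_range, k_range + 1):
            if alpha - beta + m * k == 0:
                out.append((alpha, beta, k))
    return out
```

With `degree=3` and `m=3` the only triples found are (0, 3, 1) and (3, 0, -1), which correspond to the two monomials named above.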

Note that, in order to reduce the hamiltonian function to this normal
form, we made a $2\pi$-periodic time-dependent smooth canonical
transformation which depends smoothly on the parameter, even in the case of resonance.
This transformation differs from the identity only by terms that are small of 
second order relative to the deviation from the closed trajectory (and its 
generating function differs from the generating function of the identity only 
by cubic terms). 

Further investigation of the behavior of solutions of Hamilton’s equations
proceeds in the following way. First, we throw out of the hamiltonian function
all terms of order higher than three and study the solutions of the resulting
truncated system. Then we must see how the discarded terms can affect the
behavior of the trajectories.

The study of the truncated system can be simplified by introducing a
coordinate system in the complex z-plane which rotates uniformly with
angular velocity 1/3, i.e., by the substitution $z = \zeta e^{it/3}$. Then for the variable $\zeta$
we obtain an autonomous hamiltonian system with hamiltonian function

$$-2iH_0 = -i\varepsilon\zeta\bar\zeta + h\zeta^3 - \bar h\bar\zeta^3, \qquad \text{where } \varepsilon = \omega - (1/3).$$


The fact that, in a rotating coordinate system, the truncated system is autonomous is very
good luck. The total system of Hamilton’s equations (including terms of degree higher than three
in the hamiltonian) is not only not autonomous in a rotating coordinate system, but is not
even $2\pi$-periodic (but only $6\pi$-periodic) in time. The autonomous system with hamiltonian $H_0$
is essentially the result of averaging the original system over closed trajectories of the linear
system with $\varepsilon = 0$ (where we disregard terms of degree higher than three).


The coefficient h can be made real (by a rotation of the coordinate system).
Thus the hamiltonian function in the real coordinates (x, y) is reduced to
the form

$$H_0 = \frac{\varepsilon}{2}(x^2 + y^2) + a(x^3 - 3xy^2).$$

The coefficient a depends on the frequency deviation $\varepsilon$ as on a parameter.
For $\varepsilon = 0$ this coefficient is generally not zero. Therefore, we can make this
coefficient equal to 1 by a smooth change of coordinates depending on a
parameter. Thus we must investigate the dependence on the small parameter
$\varepsilon$ of the phase portrait of the system with hamiltonian function

$$H_0 = \frac{\varepsilon}{2}(x^2 + y^2) + (x^3 - 3xy^2)$$

in the (x, y)-plane.
It is easy to see that this dependence consists of the following (Fig. 239). 


Figure 239 Passage through resonance 3:1 

For $\varepsilon = 0$ the zero level set of the function $H_0$ consists of three straight lines
through 0, intersecting at angles of 60°. Under a change of $\varepsilon$ the level set
through the saddle points always consists of three straight lines; these lines are
translated as $\varepsilon$ changes, always forming an equilateral triangle with center at
the origin. The vertices of this triangle are saddle points of the hamiltonian
function. As $\varepsilon$ passes through zero (i.e., upon passage through resonance),
the critical point at the origin changes from a minimum to a maximum.
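
The picture can be checked numerically. The sketch below is a hedged illustration for the normalized function $H_0 = \frac{\varepsilon}{2}(x^2 + y^2) + (x^3 - 3xy^2)$. The explicit saddle coordinates $(-\varepsilon/3, 0)$, $(\varepsilon/6, \pm\varepsilon/(2\sqrt 3))$, the critical value $\varepsilon^3/54$, and the factorization into three linear factors were worked out for this example and are not quoted from the text; all function names are hypothetical.

```python
import math


def H0(x, y, eps):
    """Truncated hamiltonian in the rotating coordinates."""
    return 0.5 * eps * (x * x + y * y) + (x ** 3 - 3 * x * y * y)


def grad_H0(x, y, eps):
    """Partial derivatives of H0 with respect to x and y."""
    return eps * x + 3 * x * x - 3 * y * y, eps * y - 6 * x * y


def saddles(eps):
    """The three critical points of H0 other than the origin."""
    s = eps / (2 * math.sqrt(3))
    return [(-eps / 3, 0.0), (eps / 6, s), (eps / 6, -s)]


def three_lines(x, y, eps):
    """Product of three linear factors; it equals H0 - eps**3 / 54."""
    r3 = math.sqrt(3)
    return (x - eps / 6) * (x + eps / 3 - r3 * y) * (x + eps / 3 + r3 * y)
```

The gradient should vanish at each point returned by `saddles`, all three lie at distance $|\varepsilon|/3$ from the origin on the level $H_0 = \varepsilon^3/54$, and that level is the union of the three lines encoded in `three_lines`. This is consistent with a triangle of size of order $\varepsilon$.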

Thus, for a system with hamiltonian function $H_0$, the origin is a stable
equilibrium position for all values of the parameter except at resonance,
and at resonance the origin is unstable. For values of the parameter close to
resonance, the triangle close to the origin filled by closed phase curves is
small (of order $\varepsilon$), so the “radius of stability” of the origin approaches zero as
$\varepsilon \to 0$: a small (of order $\varepsilon$) perturbation of the initial condition is sufficient
to make a phase point move outside the triangle and begin to go away from
the equilibrium position.

Returning to the original problem of the periodic trajectory, we come to 
the following conclusions (which, of course, are not proven, since we threw 
out terms of degree higher than three, but which can be justified): 


1. At the moment of passage through the resonance 3 : 1 a periodic trajectory 
generally loses its stability. 

2. For values of the parameter close to resonance there is an unstable periodic 
trajectory near the periodic trajectory under consideration on the same 
energy level manifold. It is closed after making three circulations along 
the original trajectory and one revolution around it. For the resonance 
value of the parameter, this unstable trajectory merges with the original 
one. 

3. The distance of this unstable periodic trajectory from the original 
decreases, as we approach resonance, to first order in the frequency 
deviation (i.e., as the first order of the difference of the parameter from the 
resonance value). 

4. Through this unstable trajectory on the same three-dimensional energy
level manifold there pass two two-dimensional invariant surfaces,
filled with trajectories approaching this unstable periodic trajectory
as $t \to +\infty$ on one surface and as $t \to -\infty$ on the other.

5. The location of the separatrices is such that, by intersecting with a mani- 
fold transversal to the original trajectory, we obtain a figure close to the 
three sides of an equilateral triangle and their continuations. The vertices 
of the triangle are the points of intersection of the unstable periodic 
trajectory with the transversal manifold. 

6. For initial conditions inside the triangle formed by the separatrices, a
phase point stays near the original periodic trajectory (at a distance of
order $\varepsilon$) for a long time (of order not less than $1/\varepsilon$), and for initial conditions
outside the triangle it goes off quite rapidly to a distance which is large in
comparison with $\varepsilon$.

E Splitting of separatrices


In reality, the separatrices we talked about in statements 4, 5, and 6 above 
have a very complicated structure (because of the influence of the terms 
of order higher than three which we disregarded in our approximation). In 
order to understand the situation, it is convenient to look at a two-dimen- 
sional surface transversally intersecting the original closed trajectory at 
some point on it (and lying entirely in one energy level manifold).¹⁰⁶
Trajectories beginning on this surface intersect it again after a time close to the
time of circulation around the original closed trajectory. Thus we have a 
mapping of a neighborhood of the point of intersection of the closed trajec- 
tory with the surface onto a part of the surface. This mapping has a fixed 
point (at the point where the closed trajectory intersects the surface) and is 
approximately a rotation by 120° around this point, which we take for the 
origin in our surface. 

We now consider the third power of the mapping indicated above. This 
is again a mapping of some neighborhood of the origin to a part of the sur- 
face, leaving the origin fixed. But now this mapping is approximately rotation 
by 360°, i.e., the identity: it is realized by the trajectories of our system after 
approximately three periods of our closed trajectory. 

The calculations above give nontrivial information about the structure 
of this “mapping after three periods.” In fact, by throwing out the terms of 
degree four and higher in the hamiltonian function, we change the terms of 
degree three and higher of the mapping. Therefore, the mapping after three 
periods which corresponds to the truncated hamiltonian function approxi- 
mates (with cubic error) the actual mapping after three periods. 

But we know the properties of the mapping after three periods
corresponding to the truncated hamiltonian function, since it is the mapping of the
phase flow of the system with hamiltonian function $H_0(x, y)$ after time
$6\pi$ (the proof is based on the fact that after time $6\pi$ our rotating coordinate
system returns to the original position). We now look at which of these
properties are preserved for perturbations of third-order smallness relative
to the distance from the fixed point, and which are not.

We let $A_0$ denote the mapping after three periods for the truncated system,
and A the actual mapping after three periods.


1. The mapping $A_0$ is included in a flow: it is the transformation after time
$6\pi$ in the phase flow with hamiltonian $H_0$.
There is no reason to think that the mapping A is included in a flow.

2. The mapping $A_0$ is symmetric under a rotation by 120°: there is a
nontrivial diffeomorphism g for which $g^3 = E$ and which commutes with $A_0$.
There is no reason to think that the mapping A commutes with any
nontrivial diffeomorphism g satisfying $g^3 = E$.

3. The mapping $A_0$ has three unstable fixed points at a distance of order $\varepsilon$ from the
origin, approximately the vertices of an equilateral triangle. For sufficiently
small deviations from resonance (i.e., for sufficiently small $\varepsilon$) the mapping
A also has three unstable fixed points near the vertices of an equilateral
triangle. This follows from the implicit function theorem.

4. The separatrices of fixed points of the mapping $A_0$ form, for values of the
parameter close to (but not at) resonance, a figure approximating the
sides and extended sides of an equilateral triangle. If we begin with a
point on one of the sides of the triangle, then after repeated applications
of $A_0$ we obtain a sequence of points on the same side of the triangle
approaching one of the vertices bounding the side, say $M_0$. Applying
$A_0^{-1}$, we obtain a sequence approaching the other vertex, which we will
denote by $N_0$.

¹⁰⁶ Here we have the following general phenomenon: it is easier to think about mappings after
a period, and easier to calculate with flows.


Each of the three unstable fixed points of the mapping A also has
separatrices approximating the sides of a triangle (Figure 240). Namely, those points
of the plane which approach the fixed point M after applying the mappings
$A^n$, $n \to +\infty$, form a smooth curve $\Gamma^+$ invariant under A, passing through
M and, near M, close to the side $M_0N_0$ of the separatrices of $A_0$. The points
which approach N after applications of $A^n$, where $n \to -\infty$, form another
smooth invariant curve $\Gamma^-$, passing through N and also close to $M_0N_0$ near
$N_0$.


Figure 240 Splitting of separatrices 


However, the two curves $\Gamma^+$ and $\Gamma^-$, both near the line $M_0N_0$, are not at
all obliged to coincide. This is the phenomenon of splitting of separatrices,
which accounts for the differing behavior of the trajectories of the truncated
and total systems.


The magnitude of the splitting of separatrices is exponentially small for small $\varepsilon$; therefore
it is easy to overlook the phenomenon of splitting in calculations in one or another scheme of 
“perturbation theory.” However, this phenomenon is very important in fundamental questions. 
For example, its existence immediately implies the divergence of the series in numerous versions 
of perturbation theory (since if the series converged, there would be no splitting). 

In general, the divergence of series in perturbation theory (while a good approximation is 
given by a few initial terms) is usually related to the fact that we are looking for an object which
does not exist. If we try to fit a phenomenon to a scheme which actually contradicts the essential
features of the phenomenon, then it is not surprising that our series diverge.

The Birkhoff series (which are obtained if one continues infinitely the normalizations of 
the initial terms of the Taylor series of the hamiltonian function) are one example of a formally 
convergent, but actually divergent, scheme of perturbation theory. If these series converged, 
then a general oscillating system with one degree of freedom with periodic coefficients would be 
reduced near an equilibrium position to an autonomous normal form and there would be no 
splitting of separatrices in it (whereas in fact there is). 


Returning to the original closed trajectory, we see that the three unstable
fixed points of the mapping A correspond to an unstable closed trajectory
near the original trajectory traversed three times. There is a family of
trajectories approaching this unstable trajectory as $t \to +\infty$, and another
family of trajectories approaching the unstable one as $t \to -\infty$. The points of
the trajectories of each of these families form a smooth surface containing our
unstable trajectory.

These two surfaces are also the separatrices we talked about in state- 
ments 4, 5, and 6 of Section D. By intersecting them with our transversal 
surface we obtain the invariant curves $\Gamma^+$ and $\Gamma^-$ of the mapping A. The
intersections of these two curves form a complicated network about which 
H. Poincaré, who first discovered the phenomenon of splitting of separatrices, 
wrote, “The intersections form a type of lattice, tissue, or grid with infinitely 
fine mesh. Neither of the two curves must ever cut across itself again, but 
it must bend back upon itself in a very complex manner in order to cut 
across all of the squares in the grid an infinite number of times. 

“One will be struck by the complexity of this figure, which I shall not even 
attempt to draw. Nothing is more suitable for providing us with an idea of 
the complex nature of the three-body problem, and of all the problems of 
dynamics in general, where there is no uniform integral and where the Bohlin 
series are divergent.” (H. Poincaré, “Les Méthodes Nouvelles de la Mécan- 
ique Céleste,” Vol. III, Dover, 1957, 389.) 

We should note that much is still unclear about the picture of intersecting 
separatrices. 


F Resonances of higher order 


Resonances of higher order can also be studied using a normal form. In
this connection, we note that resonances of order higher than 4 do not
usually induce instability, since in the normal form terms of degree 4 appear,
guaranteeing a minimum or maximum of the function $H_0$ even at resonance.

In the case of resonance of order n > 4, the typical development of the
phase portrait of the system with hamiltonian function $H_0$ is given by the
formula

$$H_0 = \varepsilon\tau + \tau^2\alpha(\tau) + a\tau^{n/2}\sin n\varphi, \qquad 2\tau = p^2 + q^2, \quad \alpha(0) = \pm 1,$$

and consists of the following (Figure 241).

Figure 241 Averaged hamiltonian of phase oscillations near resonance 5 : 1


For small (of order $\varepsilon$) deviations of the frequency from resonance, and at
a small (of order $\sqrt{|\varepsilon|}$) distance from the equilibrium position at the origin,
the function $H_0$ has 2n critical points near the vertices of a regular n-gon
with center at the origin. Half of these critical points are saddle points,
and the other half are maxima if the origin is a minimum or minima if the
origin is a maximum. The saddle points and stable points alternate. All n
saddle points lie on one level of the function $H_0$; their separatrices,
connecting successive saddle points, form n “islands,” each of which is filled
with closed phase curves encircling a stable point. The width of the islands
is of order $\varepsilon^{(n/4)-(1/2)}$. The closed phase curves inside each island are called
“phase oscillations” (since what varies essentially is the phase of the
oscillations around the origin). The period of the phase oscillations grows with
decreasing frequency deviation $\varepsilon$ like $\varepsilon^{-n/4}$.

Inside the narrow ring formed by the islands, closer to the origin, there are
closed phase curves encircling the origin; outside the ring the phase curves
are closed, but motion along them proceeds in the direction opposite
to that inside the ring. We note that the radius of the ring has order $\sqrt{|\varepsilon|}$
independently of the order of resonance, if this order is greater than 4. Also,
the ring of islands exists for only one of the two signs of $\varepsilon$.
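
The count of critical points and the size of the ring can be explored with a short numerical sketch. It rests on simplifying assumptions that are not in the text: the function multiplying $\tau^2$ is replaced by a constant `sigma` equal to +1 or -1, so that $H_0 = \varepsilon\tau + \sigma\tau^2 + a\tau^{n/2}\sin n\varphi$, and the critical points are sought on the rays where $\cos n\varphi = 0$ by Newton iteration on $\partial H_0/\partial\tau = 0$. The function name and default values are hypothetical.

```python
import math


def ring_critical_points(eps, n, a=1.0, sigma=1.0):
    """Critical points (tau, phi), tau > 0, of the model function
    H0 = eps*tau + sigma*tau**2 + a*tau**(n/2)*sin(n*phi)."""
    tau0 = -eps / (2.0 * sigma)       # leading-order position of the ring
    if tau0 <= 0:
        return []                     # no ring for this sign of eps
    points = []
    for j in range(2 * n):
        phi = (math.pi / 2 + j * math.pi) / n      # cos(n*phi) = 0 here
        s = 1.0 if j % 2 == 0 else -1.0            # value of sin(n*phi)
        tau = tau0
        for _ in range(50):           # Newton iteration on dH0/dtau = 0
            f = eps + 2 * sigma * tau + 0.5 * n * a * s * tau ** (n / 2 - 1)
            df = 2 * sigma + 0.5 * n * (n / 2 - 1) * a * s * tau ** (n / 2 - 2)
            tau -= f / df
        points.append((tau, phi))
    return points
```

For n = 5 and $\varepsilon = -0.01$ this returns ten points whose radii $\sqrt{2\tau}$ are close to $\sqrt{|\varepsilon|} = 0.1$, while for $\varepsilon = +0.01$ (with `sigma = 1`) it returns none, in line with the remark that the ring exists for only one sign of $\varepsilon$.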

If we pass from the truncated system with hamiltonian $H_0$ to the total
system, the separatrices split in a way similar to that described above for
resonance of order 3. The size of the splitting of the separatrices is
exponentially small (of order $e^{-1/\varepsilon^{n/4}}$), but the splitting is of fundamental
importance for investigating stability, especially in the multi-dimensional case.

Returning to our original closed trajectory, we have the following picture.
As we approach resonance along the $\varepsilon$ axis from one side,¹⁰⁷ two periodic
trajectories split off from our periodic trajectory: a stable one and an
unstable one. These new trajectories close up after n circulations along the
original trajectory and lie at a distance of order $\sqrt{|\varepsilon|}$ from the original
trajectory. Near the stable trajectory there is a zone of slow phase oscillations
with period of order $\varepsilon^{-n/4}$ and amplitude of order $\pi/n$ in the azimuthal
direction and of order $\varepsilon^{(n/4)-(1/2)}$ in the radial direction. Loss of stability of the
original periodic trajectory at the moment of passage through resonance
does not occur, at least in the approximation which we have considered.

¹⁰⁷ Unlike resonance of order 3, for which there is an unstable periodic trajectory branching
off from both sides of the resonance.


The case of resonance of fourth order is somewhat exceptional. In this case, in the normal 
form there are both resonant and non-resonant terms of order 4. The shape of the phase curves 
of the truncated system depends on which of these terms of the normal form dominates, a 
resonant one or a non-resonant one. In the first case the development is the same as for third- 
order resonance, except that in place of a triangle there is a square. In the second case the develop- 
ment is the same as for n > 4. 


In conclusion, we remark that the given normal form becomes a better 
approximation as we get closer to resonance ($\varepsilon \ll 1$) and as the deviation
of the initial point from the periodic trajectory gets smaller. That is, as the 
period of the closed trajectory and the period of oscillation of neighboring 
trajectories near it become more exactly commensurable, and as the initial 
condition approaches the closed trajectory, the interval of time grows on 
which our approximation accurately describes the behavior of the phase 
curves. 

No conclusion about the behavior of non-closed phase curves on infinite 
intervals of time (for example, about the Liapunov stability of the original 
periodic trajectory) follows from our arguments, since the terms of higher 
order which were thrown out in reducing to normal form can, over an infinite 
period of time, completely change the character of the motion. Actually, 
under the conditions considered, the original periodic trajectory is Liapunov 
stable, but the proof requires substantially new techniques beyond the 
Birkhoff normal form (cf. Appendix 8). 

periodic motion, and Kolmogorov’s theorem

The collection of solvable “integrable” problems which we have at our 
disposal is not large (one-dimensional problems, motion of a point in a 
central field, eulerian and lagrangian motions of a rigid body, the problem of 
two fixed centers, and motion along geodesics on the ellipsoid). However, 
with the help of these “integrable cases,” we can obtain meaningful informa- 
tion about motions of many important systems by considering an integrable 
problem as a first approximation. 

An example of such a situation is the problem of motion of the planets 
around the sun under the law of universal gravitation. The mass of the planets 
is approximately 0.001 of the mass of the sun, so in a first approximation we 
can disregard the interaction of the planets on one another and consider 
only the attraction by the sun. As a result, we obtain the exactly integrable 
problem of the motion of non-interacting planets around the sun; each planet 
will describe its keplerian ellipse independently of the others, and the motion 
of the system as a whole will be conditionally periodic. If we now consider the 
interactions of the planets on one another, the keplerian motion of each 
planet will be slightly changed. 

We call upon the theory of perturbations from celestial mechanics to 
study this interaction. It is clear that calculations for time of the order of 
1,000 years do not present any fundamental difficulties. However, if we want 
to study longer intervals of time, and especially if we are interested in qualita- 
tive questions about the behavior of exact solutions of the equations of 
motion on an infinite time interval, then such difficulties arise. The ac- 
cumulation of perturbations after an interval of time which is large in 
comparison to 1,000 years could cause a complete change in the character of 
the motion: for example, the planets could fall into the sun, escape from it, or 
collide with one another. 


Note that the question of the behavior of solutions of the equations of motion on an infinite 
time interval has only an indirect relation to the problem of the motion of real planets. The 
reason is that, after intervals of billions of years, small non-conservative effects not considered 
in Newton’s equations become important. Thus, the effects of the gravitational interaction of 
the planets are of real importance only when they seriously change the picture of motion within a 
finite time which is small in comparison with the time of development of non-conservative 
effects. 

In calculating motion over such finite times, computers prove to be very useful, quickly 
determining the motion of the planets for many thousands of years in the future or past. How- 
ever, we should note that even the application of modern calculating methods may be insufficient 
to predict the influence of perturbations if a phase point falls in the zone of exponential in- 
stability. 

Asymptotic and qualitative methods have even greater value for the study of charged 
particles in magnetic fields, since in this situation a particle outstrips the computer and makes 
so many orbits that mechanical calculation of its trajectory is impossible even in the absence of 
exponential instability. 


A whole series of methods has been devised for calculating perturbations
in celestial mechanics. (A detailed analysis of them can be found in the book,
“Les Méthodes Nouvelles de la Mécanique Céleste,” by H. Poincaré,
Dover, 1957.)

A difficulty with all of these methods is that they lead to divergent series 
and therefore give no information about the behavior of motion as a whole 
over infinite intervals of time. The reason for the divergence of series in the 
theory of perturbations is “small denominators”: integral linear combina- 
tions of frequencies of unperturbed motions by which it is necessary to divide 
in calculating the influence of perturbations. For exact resonance (i.e., for 
commensurable frequencies) these denominators vanish, and the cor- 
responding term of the series in the theory of perturbations becomes in- 
finitely large. Close to resonance, this term of the series is very large. 


Thus, for example, in their motion around the sun, Jupiter and Saturn, in one day, go through 
approximately 299 and 120.5 seconds of arc respectively. Therefore, the denominator $2\omega_J - 5\omega_S$ 
is very small in comparison with each of their frequencies. This amounts to a large long-period 
perturbation of the planets on one another (its period is about 800 years); the study by Laplace 
of this effect was one of the first successes of the theory of perturbations. 
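The size of this denominator and the period of the resulting perturbation can be checked from the two mean motions quoted above. The following is only an arithmetic sketch: the names `OMEGA_JUPITER`, `OMEGA_SATURN` and the two helper functions are our own, and a year of 365.25 days is an assumption.

```python
# Mean daily motions quoted in the text, in seconds of arc per day.
OMEGA_JUPITER = 299.0
OMEGA_SATURN = 120.5

ARCSEC_PER_REVOLUTION = 360 * 60 * 60   # 1,296,000 seconds of arc
DAYS_PER_YEAR = 365.25                  # assumption: Julian year


def small_denominator(omega_j, omega_s, k_j=2, k_s=-5):
    """Integer combination k_j*omega_j + k_s*omega_s of the two frequencies."""
    return k_j * omega_j + k_s * omega_s


def period_in_years(frequency_arcsec_per_day):
    """Time needed for an angle with this rate to advance one revolution."""
    days = ARCSEC_PER_REVOLUTION / abs(frequency_arcsec_per_day)
    return days / DAYS_PER_YEAR


denominator = small_denominator(OMEGA_JUPITER, OMEGA_SATURN)
long_period = period_in_years(denominator)
print(denominator, long_period)
```

The combination is 4.5 seconds of arc per day in absolute value, against 299 and 120.5 for the planets themselves, and the corresponding period comes out a little under 800 years, in agreement with the figure in the text.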


We note that the difficulty caused by small denominators is essential. The 
rational numbers form a dense set; thus in the phase space of an unperturbed 
problem, initial conditions for which we have resonance and the small 
denominators vanish form a dense set. Hence, the functions given by the 
series of perturbation theory have a dense set of singular points. 

The difficulty mentioned here is characteristic not only for problems of 
celestial mechanics, but for all problems which are close to integrable (for 
instance, for the problem of an asymmetrical rigid top under very fast rota- 
tion). Poincaré himself called the problem of studying perturbations of 
conditionally-periodic motions in a system given by the hamiltonian 


$$H = H_0(I) + \varepsilon H_1(I, \varphi), \qquad \varepsilon \ll 1,$$


in action-angle variables $I$ and $\varphi$, the fundamental problem of dynamics. Here 
$H_0$ is the hamiltonian of the unperturbed problem, and $\varepsilon H_1$ a perturbation 
which is a $2\pi$-periodic function of the angle variables $\varphi_1, \ldots, \varphi_n$. In the 
unperturbed problem ($\varepsilon = 0$) the angles $\varphi$ change uniformly with constant 
frequencies 


$$\omega_k = \frac{\partial H_0}{\partial I_k},$$


and all the action variables are first integrals. 
We must investigate the phase curves of Hamilton’s equations 

$$\dot I = -\frac{\partial H}{\partial \varphi}, \qquad \dot\varphi = \frac{\partial H}{\partial I}$$

in a phase space which is a direct product of a region in n-dimensional space 
with coordinates $I$ and the n-dimensional torus with angular coordinates $\varphi$. 


A substantial advance in the study of phase curves of this perturbed 
problem was begun in 1954 with the work of A. N. Kolmogorov in “On conservation 
of conditionally-periodic motions for a small change in Hamilton’s 
function,” Dokl. Akad. Nauk SSSR 98:4 (1954) 525-530 (Russian). In this 
appendix we present the basic results obtained since then in this area. The 
proofs can be found in the following works: 


V. I. Arnold, “Small denominators I, Mapping the circle onto itself,” Izv. Akad. Nauk SSSR 
Ser. Mat. 25 (1961), 21-86. 


V. I. Arnold, “Small denominators II, Proof of a theorem of A. N. Kolmogorov on the preserva- 
tion of conditionally periodic motions under a small perturbation of the Hamiltonian,” 
Russian Math. Surveys 18:5 (1963). 


V. I. Arnold, “Small denominators III. Small denominators and problems of stability of motion 
in classical and celestial mechanics,” Russian Math. Surveys 18:6 (1963). 


V. I. Arnold, A. Avez, Ergodic problems of classical mechanics, New York, Benjamin, 1968. 


J. Moser, On invariant curves of area-preserving mappings of an annulus, Nachr. Akad. Wiss. 
Göttingen, Math.-Phys. Kl. IIa (1962), 1-20. 


J. Moser, A rapidly converging iteration method and nonlinear differential equations, Annali 
della Scuola Norm. Sup. di Pisa (3) 20 (1966), 265-315; 499-535. 


J. Moser, Convergent series expansions for quasi-periodic motions, Math. Ann. 169 (1967), 
136-176. 


C. L. Siegel, J. K. Moser, Lectures on Celestial Mechanics, Springer-Verlag, 1971. 
S. Sternberg, Celestial Mechanics, I, II, New York, Benjamin, 1969. 


Before formulating our results, we will briefly discuss the behavior of 
phase curves in the unperturbed problem already studied in Chapter 10. 


A Unperturbed motion 


The system with hamiltonian $H_0(I)$ has n first integrals in involution (the n 
action variables). Every level set of all these integrals is an n-dimensional 
torus in 2n-dimensional phase space. This torus is invariant with respect to 
the phase flow of the unperturbed system: every phase curve starting at a 
point of our torus remains on it. 

The motion of a phase point on the invariant torus $I = \text{const}$ is condi- 
tionally-periodic. The frequencies of this motion are the derivatives of the 
unperturbed hamiltonian with respect to the action variables: 


$$\dot\varphi_k = \omega_k(I), \qquad \text{where } \omega_k = \frac{\partial H_0}{\partial I_k}.$$

Therefore, the phase curve densely fills a torus whose dimension is equal 
to the number of frequencies $\omega_k$ which are arithmetically independent. 

We note that the frequencies depend on which torus we are looking at; 
i.e., which values of the first integrals we have fixed. A system of n functions 
$\omega$ of n variables $I$ is generally functionally independent; in such a case we 
can simply number the tori by their frequencies, choosing the variables $\omega$ 
for coordinates in a neighborhood of the point under consideration in the 
space of action variables $I$. 

The case when the frequencies are functionally independent will be called 
the nondegenerate case. The conditions for nondegeneracy have the form 

$$\det \left| \frac{\partial \omega}{\partial I} \right| = \det \left| \frac{\partial^2 H_0}{\partial I^2} \right| \neq 0.$$

Thus, in the nondegenerate case, the unperturbed problem determines on the 
different invariant tori in phase space conditionally-periodic motions with 
different frequencies. In particular, the invariant tori on which the number of 
frequencies is maximal (i.e., n) form a dense set in phase space; such tori are 
called non-resonant tori. 

It can be shown that the non-resonant tori form a set of full measure, 
i.e., the Lebesgue measure of the union of all invariant resonant tori of the 
unperturbed non-degenerate system is equal to zero. Nevertheless, invariant 
resonant tori exist and are mixed in with the non-resonant tori in such a way 
that they too form a dense set. Furthermore, the set of resonant tori with any 
number of independent frequencies from 1 to n — 1 is dense. In particular, 
the invariant tori on which all phase curves are closed (the number of in- 
dependent frequencies is 1) form a dense set. Nevertheless, we note that the 
probability of landing on a resonant torus by a random choice of initial 
point in the phase space of the unperturbed system, is equal to zero (since the 
probability of landing on a rational number by a random choice of a real 
number is zero). Thus, by disregarding sets of measure zero, we can say that 
almost all invariant tori in a nondegenerate unperturbed system are non- 
resonant and have a total set of n arithmetically independent frequencies. 

On a non-resonant torus, the trajectory of a conditionally-periodic motion 
is dense. Thus, for almost all initial conditions, a phase curve of a non-de- 
generate unperturbed system densely fills an invariant torus whose dimension 
is equal to the number of degrees of freedom (i.e., half the dimension of the 
phase space). 

To better understand the whole picture, we consider the case of two 
degrees of freedom (n = 2). In this case, the phase space is four-dimensional 
so each energy level set is three-dimensional. We fix one such level set. This 
three-dimensional manifold, fibered by two-dimensional tori, can be repre- 
sented in ordinary three-dimensional space as a family of concentric tori 
lying inside one another (Figure 242). 




Figure 242 Invariant tori in a three-dimensional energy level manifold 




Appendix 8: Theory of perturbations of conditionally periodic motion 


The phase curves are windings of these tori; both frequencies of circulation 
change from torus to torus. In general, not only both frequencies but also 
their ratio will change from torus to torus. If the derivative of the ratio of 
frequencies with respect to the action variable numbering the tori on the 
given level set of the function $H_0$ is not zero, then we say that our system is 
isoenergetically nondegenerate. The condition for isoenergetic nondegeneracy 
has (as is easy to calculate) the form 


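$$\det \begin{pmatrix} \dfrac{\partial^2 H_0}{\partial I^2} & \dfrac{\partial H_0}{\partial I} \\[6pt] \dfrac{\partial H_0}{\partial I} & 0 \end{pmatrix} \neq 0.$$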


The conditions for nondegeneracy and isoenergetic nondegeneracy are independent from 
one another; i.e., a nondegenerate system could be isoenergetically degenerate, and an iso- 
energetically nondegenerate system could be degenerate. In the many-dimensional case (n > 2) 
isoenergetic nondegeneracy means nondegeneracy of the following mapping of the (n — 1)- 
dimensional level manifold of the function $H_0$ of n action variables to the projective space of 
dimension $n - 1$: 


$$I \mapsto (\omega_1(I) : \omega_2(I) : \cdots : \omega_n(I)).$$
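The independence of the two conditions can be seen on small examples with n = 2. The sketch below evaluates both determinants at the point I = (1, 1) for two hamiltonians of our own choosing (they are illustrations, not taken from the text): $H_0 = I_1 + I_2^2/2$ and $H_0 = I_1^2 - I_2^2$. The helper names are hypothetical.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def bordered(hessian, gradient):
    """Hessian of H0 bordered by the frequency vector, with 0 in the corner."""
    (a, b), (c, d) = hessian
    w1, w2 = gradient
    return [[a, b, w1], [c, d, w2], [w1, w2, 0.0]]


# Example 1 (our assumption): H0 = I1 + I2**2 / 2 at I = (1, 1).
hess_1, grad_1 = [[0.0, 0.0], [0.0, 1.0]], (1.0, 1.0)
# Example 2 (our assumption): H0 = I1**2 - I2**2 at I = (1, 1).
hess_2, grad_2 = [[2.0, 0.0], [0.0, -2.0]], (2.0, -2.0)

for name, hess, grad in (("example 1", hess_1, grad_1), ("example 2", hess_2, grad_2)):
    print(name, det2(hess), det3(bordered(hess, grad)))
```

In the first example the Hessian determinant vanishes while the bordered determinant does not (degenerate, but isoenergetically nondegenerate). In the second the roles are reversed at the chosen point, which lies on the zero energy level.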


Now consider an isoenergetically nondegenerate system with two degrees 
of freedom. It is easy to construct a two-dimensional plane in the three- 
dimensional energy level set transversally intersecting the two-dimensional 
tori of our family (in a family of concentric circles in the model in three- 
dimensional euclidean space). 

A phase curve beginning in such a plane returns to it after making a 
circuit around the torus. As a result we obtain a new point on the same circle 
in which the torus intersects the plane. In this way there arises a mapping of 
the plane to itself. 

This mapping of the plane to itself fixes the concentric meridian circles in 
which the plane intersects the invariant tori. Every circle is rotated through 
some angle, namely through that fraction of an entire revolution that the 
frequency along the meridian constitutes of the frequency along the equator. 

If the system is isoenergetically nondegenerate, the angle of revolution of 
invariant circles in the plane of intersection changes from one circle to 
another. Therefore, on some circles this angle will be commensurable with a 
whole revolution, and on others it will be incommensurable. Each of these 
classes of circles will form a dense set, but on almost all circles (in the sense of 
Lebesgue measure) the angle of rotation will be incommensurable with a 
whole revolution. 

The commensurability or incommensurability is manifested in the follow- 
ing way on the behavior of points of a circle under the mapping of the region 
to itself. If the angle of rotation is commensurable with a whole rotation, then 
after several iterations of the mapping the point will return to its initial 
position (the number of iterations will be larger as the denominator of the 
fraction expressing the angle of rotation is larger). If the angle of rotation is 
incommensurable with a whole rotation, the successive images of the point 
under repetitions of the mapping will densely fill up the meridian circle. 
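Both kinds of behavior are easy to observe numerically. In the sketch below the angle of rotation is measured as a fraction of a whole revolution; `return_time` and `largest_gap` are hypothetical helper names, exact rational arithmetic is used for the commensurable case, and the golden-mean angle is merely a convenient choice of incommensurable angle.

```python
from fractions import Fraction
import math


def return_time(angle, max_iter=10_000):
    """First n >= 1 with n*angle equal to 0 modulo 1 (angle is a Fraction)."""
    x = Fraction(0)
    for n in range(1, max_iter + 1):
        x = (x + angle) % 1
        if x == 0:
            return n
    return None


def largest_gap(angle, n_points):
    """Largest arc of the circle free of the first n_points images of 0."""
    points = sorted((k * angle) % 1.0 for k in range(n_points))
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])   # arc that wraps around
    return max(gaps)


golden = (math.sqrt(5.0) - 1.0) / 2.0           # incommensurable angle
print(return_time(Fraction(3, 7)), return_time(Fraction(2, 8)))
print(largest_gap(golden, 100), largest_gap(golden, 1000))
```

For a rational angle the return time equals the denominator of the fraction in lowest terms; for the incommensurable angle the largest uncovered arc shrinks as more images are taken, as expected when the images fill the circle densely.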

We note further that commensurability corresponds to resonant tori and 
incommensurability to non-resonant tori. Also, the existence of resonant 
tori implies the following property. Consider some power of the mapping of 
our region to itself induced by the phase curves. Let the exponent be the 
denominator of the fraction expressing the ratio of the frequencies on one of 
the resonant tori. Then the mapping raised to the indicated power has a 
whole circle consisting entirely of fixed points (namely, the meridian of the 
resonant torus under consideration). 

Such behavior of fixed points is unnatural for mappings in any sort of 
general form, even canonical mappings (fixed points are usually isolated). 
In the given case, a whole circle of fixed points arises because we have con- 
sidered an unperturbed integrable system. For arbitrarily small perturbations 
of general form, this property of the mapping (having a whole circle of fixed 
points) must fail. The circle of fixed points must be dispersed so that only a 
finite number remain. 

In other words, under small perturbations of our integrable system, we 
expect a change in the qualitative picture of the phase curves, if only in the 
respect that entire invariant tori filled out by closed phase curves will dis- 
integrate so that there remain only a finite number of closed curves, near 
those for the unperturbed system, and the remaining phase curves will be 
more complicated. We have already encountered such a case in Appendix 7 
in investigating phase oscillations near resonance. 

We now consider what happens to non-resonant invariant tori under a 
small perturbation of a hamiltonian function. Formal application of the 
principle of averaging (i.e., the first approximation of the classical theory of 
perturbations, cf. Section 52) leads us to the conclusion that a non-resonant 
torus does not undergo any evolution. 


We note that the fact that the perturbations are hamiltonian is essential, since for non- 
conservative perturbations it is clear that the action variables may evolve. In celestial mechanics, 
their evolution means a secular change in the major semi-axes of the keplerian ellipses, i.e., the 
planets falling into the sun, colliding, or escaping to a large distance in a time which is inversely 
proportional to the size of the perturbation. If conservative perturbations led to evolutions in 
a first approximation, this would manifest itself in the fate of the planets after a time on the 
order of 1,000 years. Fortunately, the order of magnitude of the non-conservative perturbations 
is much less. 


The theorem of Kolmogorov, formulated below, furnishes one justification 
for the conclusion, drawn from the non-rigorous theory of perturbations, 
about the absence of evolution of action variables. 


B Invariant tori in a perturbed system 


Theorem. If an unperturbed system is nondegenerate, then for sufficiently 
small conservative hamiltonian perturbations, most non-resonant invariant 
tori do not vanish, but are only slightly deformed, so that in the phase space 
of the perturbed system, too, there are invariant tori densely filled with phase 
curves winding around them conditionally-periodically, with a number of 
independent frequencies equal to the number of degrees of freedom. 

These invariant tori form a majority in the sense that the measure of the 
complement of their union is small when the perturbation is small. 


A. N. Kolmogorov’s proof of this theorem is based on the following two 
observations. 

1. We fix a non-resonance set of frequencies of the unperturbed system so 
that the frequencies are not only independent, but do not even approximately 
satisfy any resonance conditions of low order. More precisely, we fix a set 
of frequencies $\omega$ for which there exist $C$ and $\nu$ such that $|(\omega, k)| > C|k|^{-\nu}$ 
for all integral vectors $k \neq 0$. 

It can be shown that, if $\nu$ is sufficiently large (say $\nu = n + 1$), then the 
measure of the set of such vectors $\omega$ (lying in a fixed bounded region) for 
which the indicated condition of non-resonance is violated, is small when C 
is small. 
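The non-resonance condition can be probed numerically for a given frequency vector by scanning integer vectors up to some cutoff. The sketch below is only a finite check, not a proof: `diophantine_constant` is a hypothetical helper, the norm $|k| = \sum |k_i|$ is our assumption (the text does not fix the norm), and the cutoff `k_max` is arbitrary.

```python
import math
from itertools import product


def diophantine_constant(omega, nu, k_max):
    """Minimum of |(omega, k)| * |k|**nu over integer k with 0 < |k| <= k_max.

    Here |k| is the sum of |k_i|. A value bounded away from zero as k_max
    grows is consistent with the non-resonance condition; the value 0.0
    signals an exact resonance within the scanned range.
    """
    best = math.inf
    for k in product(range(-k_max, k_max + 1), repeat=len(omega)):
        norm = sum(abs(ki) for ki in k)
        if norm == 0 or norm > k_max:
            continue
        dot = abs(sum(w * ki for w, ki in zip(omega, k)))
        best = min(best, dot * norm ** nu)
    return best


golden = (1.0 + math.sqrt(5.0)) / 2.0
print(diophantine_constant((1.0, golden), nu=1, k_max=50))   # stays away from 0
print(diophantine_constant((1.0, 1.5), nu=1, k_max=50))      # resonance 3 - 2*1.5
```

The frequency ratio 3 : 2 is detected as a resonance by the vector k = (3, -2), while for the golden-mean ratio the scanned products stay bounded away from zero.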

Next, near a non-resonant torus of the unperturbed system corresponding 
to a fixed value of the frequencies, we will look for an invariant torus of the 
perturbed system on which there is conditionally-periodic motion with 
exactly the same frequencies as the ones we fixed, and which necessarily 
satisfy the condition of being non-resonant described above. 

In this way, instead of the variations of frequency customary in perturba- 
tion schemes (consisting of the introduction of frequencies depending on the 
perturbation), we must hold constant the non-resonant frequencies, while 
selecting initial conditions depending on the perturbation in order to 
guarantee motion with the given frequencies. This can be done by a small 
(when the perturbation is small) change of initial conditions, because the 
frequencies change with the action variables according to the non-degen- 
eracy condition. 

2. The second observation is that, to find an invariant torus, instead of 
using the usual series expansion in powers of the perturbation parameter, we 
can use a rapidly convergent method similar to Newton’s method of tangents. 

Newton’s method of tangents for finding roots of algebraic equations with 
initial error $\varepsilon$ gives, after n approximations, an error of order $\varepsilon^{2^n}$. Such 
super-convergence allows us to paralyze the influence of the small denomin- 
ators appearing in every approximation, and in the end succeeds not only in 
carrying out an infinite number of approximations, but also in showing the 
convergence of the entire procedure. 
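The squaring of the error at each step is easy to observe on a scalar equation. The following sketch applies Newton's method to $x^2 - 2 = 0$; the equation, the starting point 1.5, and the helper name `newton_errors` are our own choices for illustration and have nothing to do with the functional equation for the torus itself.

```python
import math


def newton_errors(f, df, x0, root, steps):
    """Errors |x_n - root| of Newton's iterates x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    errors = [abs(x0 - root)]
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
        errors.append(abs(x - root))
    return errors


errors = newton_errors(lambda x: x * x - 2.0, lambda x: 2.0 * x,
                       x0=1.5, root=math.sqrt(2.0), steps=3)
for n, err in enumerate(errors):
    print(n, err)
```

Each error is no larger than the square of the previous one, so the number of correct digits roughly doubles at every step; this is the super-convergence that outweighs the loss caused by a small denominator at each approximation.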
The assumption under which all this can be done is that the unperturbed 
hamiltonian function $H_0(I)$ is analytic and nondegenerate, and the perturbing 
hamiltonian function $\varepsilon H_1(I, \varphi)$ is analytic and $2\pi$-periodic in the angle vari- 
ables $\varphi$. The presence of the small parameter $\varepsilon$ is immaterial: it is important 
only that the perturbation be sufficiently small in some complex neighbor- 
hood of radius $\rho$ of the real plane of the variables $\varphi$ (less than some positive 
function $M(\rho, H_0)$). 

As J. Moser showed, the requirement of analyticity can be changed to 
differentiability of sufficiently high order if we combine Newton’s method 
with an idea of J. Nash, the application of a smoothing operator at each 
approximation. 

The resulting conditionally-periodic motions of the perturbed system with 
fixed frequencies $\omega$ turn out to be smooth functions of the parameter $\varepsilon$ of 
perturbation. Therefore, they could have been sought, without Newton’s 
method, in the form of a series in powers of $\varepsilon$. The coefficients of this series, 
called the Lindstedt series, can actually be found; however, we can prove its 
convergence only indirectly, with the help of newtonian approximations. 


C Zones of instability 


The presence of invariant tori in the phase space of the perturbed problem 
means that, for most initial conditions in a system which is nearly integrable, 
motion remains conditionally periodic with a maximal set of frequencies. 

The question naturally arises of what happens to the remaining phase 
curves, with initial conditions falling into the gaps between the invariant tori 
which replace the resonant invariant tori of the non-perturbed problem. 

The disintegration of a resonant torus on which the number of frequencies 
is one less than the maximum is easy to investigate in a first-order perturba- 
tion theory. To do this, we must average the perturbation over the (n — 1)- 
dimensional invariant tori into which the resonant invariant torus is 
decomposed and which are densely filled out by phase curves of the un- 
perturbed system. After averaging, we obtain a conservative system with one 
degree of freedom (cf. the investigation of phase oscillations near resonance 
in Appendix 7), which is easy to study. 

In the approximation under consideration we have, near the n-dimensional 
reducible torus, stable and unstable (n — 1)-dimensional tori, with phase 
oscillations around the stable ones. The corresponding conditionally- 
periodic motions have a full set of n frequencies, of which n — 1 are the fast 
frequencies of the original oscillations and one is the slow (of order $\sqrt{\varepsilon}$) 
frequency of the phase oscillations. 
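The order $\sqrt{\varepsilon}$ of the slow frequency can be seen on the simplest model of an averaged system with one degree of freedom. The pendulum-type hamiltonian below, with constants $a, b > 0$ and canonical variables $p, q$, is an assumed model for illustration and is not derived here from a particular perturbation.

```latex
% Assumed model of the averaged system near the resonance:
H = \tfrac{1}{2} a p^{2} + \varepsilon b \cos q ,
\qquad
\dot q = a p , \quad \dot p = \varepsilon b \sin q
\;\Longrightarrow\;
\ddot q = \varepsilon a b \sin q .

% Near q = \pi, put q = \pi + x with x small; then \sin q \approx -x and
\ddot x = -\varepsilon a b \, x ,
\qquad
\omega_{\text{slow}} = \sqrt{\varepsilon a b} = O(\sqrt{\varepsilon}) .

% Near q = 0 the linearization is \ddot q = \varepsilon a b \, q , an unstable equilibrium.
```

In this model the stable equilibrium corresponds to the stable (n - 1)-dimensional torus, the unstable one to the unstable torus, and the oscillations about the stable equilibrium to the phase oscillations.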

However, one must not conclude that the only difference between motions 
in the unperturbed and perturbed systems is the appearance of “islands” 
of phase oscillations. In fact, the actual phenomena are much more compli- 
cated than the first approximation described above. One manifestation of 
this complicated behavior of the phase curves of the perturbed problem is 
the splitting of separatrices discussed in Appendix 7. 
To study motions of a perturbed system outside of the invariant tori we 
must distinguish the cases of two and higher degrees of freedom. For two 
degrees of freedom, the dimension of the phase space is four, and an energy 
level manifold is three-dimensional. Therefore, the invariant two-dimensional 
tori divide each energy level set. Thus, a phase curve beginning in the gap 
between two invariant tori of the perturbed system remains forever confined 
between those tori. No matter how complicated this curve appears, it does 
not leave its gap, and the corresponding action variables remain forever near 
their initial conditions. 

If the number n of degrees of freedom is greater than two, the n-dimen- 
sional invariant tori do not divide the (2n — 1)-dimensional energy level 
manifold but are arranged in it like points on a plane or lines in space. In this 
case the “gaps” corresponding to different resonances are connected to one 
another, so the invariant tori do not prevent phase curves starting near 
resonance from going far away. Hence, there is no reason to expect that the 
action variables along such a phase curve will remain close to their initial 
values for all time. 

In other words, under sufficiently small perturbations of systems with 
two degrees of freedom (satisfying the generally fulfilled condition of iso- 
energetic nondegeneracy), not only do the action variables along a phase 
trajectory have no secular perturbations in any approximation of perturba- 
tion theory (i.e., they change little in a time interval on the order of $(1/\varepsilon)^N$ for 
any N, where $\varepsilon$ is the magnitude of the perturbation), but these variables 
remain forever near their initial values. This is true, both for non-resonant 
phase curves conditionally-periodically filling out two-dimensional tori (and 
comprising most of the phase space), and for the remaining initial conditions. 

At the same time, there exist systems with more than two degrees of 
freedom satisfying all the nondegeneracy conditions, in which, although for 
most initial conditions motion is conditionally periodic, for some initial 
conditions a slow drift of the action variables away from their initial values 
occurs. The average velocity of this drift in known examples¹⁰⁸ is on the 
order of $e^{-1/\sqrt{\varepsilon}}$, i.e., this velocity decreases faster than any power of the 
perturbation parameter. Thus it is not surprising that this drifting away does 
not appear in any approximation of perturbation theory. (By average vel- 
ocity, we mean the ratio of the increase of action variables to time, so that 
we are actually dealing with an increase of order 1 after a time of order $e^{1/\sqrt{\varepsilon}}$.) 
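That an exponentially small velocity eventually falls below any fixed power of the perturbation parameter can be checked on numbers. In the sketch below the power N = 10 and the sample values of the parameter are arbitrary choices of ours, and `drift_velocity` is a hypothetical name for the model expression.

```python
import math


def drift_velocity(eps):
    """Model drift velocity exp(-1/sqrt(eps)) for perturbation size eps."""
    return math.exp(-1.0 / math.sqrt(eps))


N = 10   # arbitrary fixed power to compare against
for eps in (1e-2, 1e-3, 1e-4):
    print(eps, drift_velocity(eps), eps ** N)
```

For the largest sample value the exponential is still the bigger of the two, but for the smallest it has already dropped below the tenth power; the same happens for any fixed N once the perturbation is small enough.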

An upper bound on the average velocity of the drift of the action variables 
in general nearly integrable systems of hamiltonian equations with n degrees 
of freedom is included in the recent work of N. N. Nekhoroshev.¹⁰⁹ 


108 Cf. V. I. Arnold, Instability of dynamical systems with many degrees of freedom. Soviet 
Mathematics 5:3 (1964) 581-585. 


109 N. N. Nekhoroshev, The behavior of hamiltonian systems that are close to integrable ones, 
Functional Analysis and Its Applications, 5:4 (1971); Uspekhi Mat. Nauk 32:6 (1977). 
This bound, like the lower bound mentioned above, has the form $e^{-1/\varepsilon^d}$; 
thus the increase of the action variables is small while the time is small in 
comparison with $e^{1/\varepsilon^d}$, if $\varepsilon < \varepsilon_0$. Here $\varepsilon$ is the magnitude of the perturbation, 
and $d$ is a number between 0 and 1 defined, like $\varepsilon_0$, by the properties of the 
unperturbed hamiltonian $H_0$. In addition, a nondegeneracy condition is 
imposed on the unperturbed hamiltonian (this condition has a long formula- 
tion, but is generally satisfied; in particular, strong convexity of the un- 
perturbed hamiltonian is sufficient, i.e., positive or negative definiteness of 
the second differential of $H_0$). 

From this upper bound it is clear that secular changes of the action vari- 
ables are not detected by any approximation of perturbation theory, since 
the average velocity of these changes is exponentially small. We note also 
that secular changes of the action variables obviously have no directional 
character, but are represented by more or less random wandering in the 
resonant regions between the invariant tori. A more detailed discussion of 
the questions arising here can be found in the article, “Stochastic instability 
of nonlinear oscillations,” by G. M. Zaslavski and B. V. Chirikov, Soviet 
Physics Uspekhi, v. 105, no. 1 (1971), 3-39. 


D Variants of the theorem on invariant tori 


Statements analogous to the theorem on conservation of invariant tori in an 
autonomous system have been proved for non-autonomous equations with 
periodic coefficients and for symplectic mappings. Analogous statements are 
valid in the theory of small oscillations in a neighborhood of an equilibrium 
position of an autonomous system or a system with periodic coefficients, as 
well as in a neighborhood of a closed phase curve of a phase flow or in a 
neighborhood of a fixed point of a symplectic mapping. 

The nondegeneracy conditions necessary in the various cases are different. 
For reference, we will now give these nondegeneracy conditions. We will 
limit ourselves to the simplest requirements of nondegeneracy, which are all 
fulfilled by systems in “general position.” In many cases, the requirements 
of nondegeneracy can be weakened, but the advantage gained by this is offset 
by the complication of the formulas. 


1. Autonomous systems. The hamiltonian function is 


$$H = H_0(I) + \varepsilon H_1(I, \varphi), \qquad I \in G \subset \mathbb{R}^n, \quad \varphi \bmod 2\pi \in T^n.$$


The nondegeneracy condition 

$$\det \left| \frac{\partial^2 H_0}{\partial I^2} \right| \neq 0$$

guarantees preservation¹¹⁰ of most invariant tori under small perturbations 
($\varepsilon \ll 1$). 

110 It is understood that the tori are slightly deformed under perturbations. 

The condition for isoenergetic nondegeneracy 

$$\det \begin{pmatrix} \dfrac{\partial^2 H_0}{\partial I^2} & \dfrac{\partial H_0}{\partial I} \\[6pt] \dfrac{\partial H_0}{\partial I} & 0 \end{pmatrix} \neq 0$$


guarantees the existence on every energy level manifold of a set of invariant 
tori whose complement has small measure. The frequencies on these tori 
generally depend on the size of the perturbation, but the ratios of frequencies 
are preserved under changes in $\varepsilon$. 

If n = 2, then the condition for isoenergetic nondegeneracy also guarantees 
stability of the action variables, in the sense that they remain forever close to 
their initial values for sufficiently small perturbations. 

2. Periodic systems. The hamiltonian function is 


$$H = H_0(I) + \varepsilon H_1(I, \varphi, t), \qquad I \in G \subset \mathbb{R}^n, \quad \varphi \bmod 2\pi \in T^n;$$


the perturbation is $2\pi$-periodic not only in $\varphi$, but also in $t$. It is natural to look 
at the unperturbed system in the (2n + 1)-dimensional space $\{(I, \varphi, t)\} = \mathbb{R}^n \times T^{n+1}$. 
The invariant tori have dimension n + 1. The nondegeneracy 
condition 

$$\det \left| \frac{\partial^2 H_0}{\partial I^2} \right| \neq 0$$

guarantees the preservation of most (n + 1)-dimensional invariant tori under 
a small perturbation ($\varepsilon \ll 1$). 

If n = 1, this nondegeneracy condition also guarantees stability of the 
action variable, in the sense that it remains forever near its initial value for 
sufficiently small perturbations. 

3. Mappings $(I, \varphi) \to (I', \varphi')$ of the “2n-dimensional annulus.” The gener- 
ating function is 


$$S(I', \varphi) = S_0(I') + \varepsilon S_1(I', \varphi), \qquad I' \in G \subset \mathbb{R}^n, \quad \varphi \in T^n.$$


The nondegeneracy condition 

$$\det \left| \frac{\partial^2 S_0}{\partial I^2} \right| \neq 0$$

guarantees the preservation of most invariant tori of the unperturbed map- 
ping $(I, \varphi) \to (I, \varphi + \partial S_0/\partial I)$ under small perturbations ($\varepsilon \ll 1$). 

If n = 1, we obtain an area-preserving mapping of the ordinary annulus to 
itself. The unperturbed mapping is represented on each circle $I = \text{const}$ as a 
rotation. In this case the nondegeneracy condition means that the angle of 
rotation changes from one circle to another. 

The invariant tori in the case n = 1 are ordinary circles. In this case, the 
theorem guarantees that under iterations of the mapping all the images of a 
point will remain near the circle on which the original point lay, if the 
perturbation is sufficiently small. 
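This confinement can be watched on a standard example of a perturbed twist mapping of the annulus. In the sketch below we assume $S_0(I') = I'^2/2$ and $S_1 = \cos\varphi$, and we assume the convention that the mapping is generated by $I'\varphi + S(I', \varphi)$; the perturbation size 0.05, the initial circle, and the function names are likewise our own choices. Here $\partial^2 S_0/\partial I^2 = 1$, so the nondegeneracy condition holds. A finite number of iterations illustrates the statement but of course proves nothing.

```python
import math


def perturbed_twist(action, angle, eps):
    """One step of I' = I + eps*sin(phi), phi' = phi + I' (phi taken mod 2*pi)."""
    new_action = action + eps * math.sin(angle)
    new_angle = (angle + new_action) % (2.0 * math.pi)
    return new_action, new_angle


def max_action_drift(action0, angle0, eps, n_iter):
    """Largest |I_n - I_0| seen over the first n_iter iterations."""
    action, angle = action0, angle0
    worst = 0.0
    for _ in range(n_iter):
        action, angle = perturbed_twist(action, angle, eps)
        worst = max(worst, abs(action - action0))
    return worst


# Initial circle with golden-mean rotation number (an arbitrary choice).
action0 = 2.0 * math.pi * (math.sqrt(5.0) - 1.0) / 2.0
print(max_action_drift(action0, 0.3, 0.0, 1_000))     # unperturbed: no drift
print(max_action_drift(action0, 0.3, 0.05, 20_000))   # perturbed: small drift
```

For zero perturbation the action does not change at all, since each circle is simply rotated; for the small perturbation the action stays within a narrow band around its initial value over all the iterations computed.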

4. Neighborhoods of equilibrium positions (autonomous case). An equili- 
brium position is assumed to be stable in a linear approximation so that n 
characteristic frequencies $\omega_1, \ldots, \omega_n$ are defined. We assume that there are no 
resonance relations among the characteristic frequencies, i.e., no relations 


$$k_1\omega_1 + \cdots + k_n\omega_n = 0 \quad \text{with integers } k_i \text{ such that } 0 < \sum |k_i| \le 4.$$

Then the hamiltonian function can be reduced to the Birkhoff normal form 
(cf. Appendix 7) 

$$H = H_0(\tau) + \cdots,$$


where $H_0(\tau) = \sum_k \omega_k \tau_k + \tfrac{1}{2} \sum_{k,l} \omega_{kl} \tau_k \tau_l$, and the dots denote terms of degree 
higher than four with respect to the distance from the equilibrium position. 
The nondegeneracy condition 


$$\det |\omega_{kl}| \neq 0$$


guarantees the existence of a set of invariant tori of almost full measure in a 
sufficiently small neighborhood of the equilibrium position. 
The condition for isoenergetic nondegeneracy, 


$$\det \begin{pmatrix} \omega_{kl} & \omega_k \\ \omega_l & 0 \end{pmatrix} \neq 0,$$


guarantees the existence of such a set of invariant tori on every energy level 
set (sufficiently close to the critical point). 

In the case n = 2, the condition for isoenergetic nondegeneracy is satisfied 
if the quadratic part of the function H is not divisible by the linear part. In 
this case, isoenergetic nondegeneracy guarantees Liapunov stability of the 
equilibrium position. 

5. Neighborhoods of equilibrium positions (periodic case). Here again we 
assume stability in a linear approximation, so that n characteristic frequencies 
$\omega_1, \ldots, \omega_n$ are defined. We assume that there are no resonance 
relations 


$$k_1\omega_1 + \cdots + k_n\omega_n + k_0 = 0 \quad \text{with } 0 < \sum_{i=1}^{n} |k_i| \le 4$$


among the characteristic frequencies and the frequency of the time-depen- 
dence of the coefficients (which we will assume equal to 1). 

Then the hamiltonian function can be reduced to a Birkhoff normal form 
in the same way as in the autonomous case, but with $2\pi$-periodicity with 
respect to time in the remainder term. 

The nondegeneracy condition 


$$\det |\omega_{kl}| \neq 0$$

guarantees the existence of (n + 1)-dimensional invariant tori in the (2n + 1)- 
dimensional extended phase space, near the circle $\tau = 0$ representing the 
equilibrium position. 

In the case n = 1 the nondegeneracy condition reduces to the non-vanish- 
ing of the derivative of the period of small oscillations with respect to the 
square of the amplitude of small oscillations. In this case, nondegeneracy 
guarantees that the equilibrium position is Liapunov stable. 

6. Fixed points of mappings. Here we assume that all 2n eigenvalues of the 
linearization of a canonical mapping at a fixed point have modulus 1 and do 
not satisfy any low-order resonance relations of the form: 


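$$\lambda_1^{k_1} \cdots \lambda_n^{k_n} = 1, \qquad |k_1| + \cdots + |k_n| \le 4$$


(where the 2n eigenvalues are $\lambda_1, \ldots, \lambda_n, \bar\lambda_1, \ldots, \bar\lambda_n$). 
Then if we disregard terms of higher than third order in the Taylor series 
at the fixed point, the mapping can be written in Birkhoff normal form 


$$(\tau, \varphi) \to (\tau, \varphi + \alpha(\tau)), \qquad \text{where } \alpha(\tau) = \frac{\partial S}{\partial \tau}, \quad S = \sum_k \omega_k \tau_k + \tfrac{1}{2} \sum_{k,l} \omega_{kl} \tau_k \tau_l$$


(the usual coordinates in a neighborhood of the equilibrium position are 
$p_k = \sqrt{2\tau_k} \cos\varphi_k$, $q_k = \sqrt{2\tau_k} \sin\varphi_k$). 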
The nondegeneracy condition 


$$\det |\omega_{kl}| \neq 0$$


guarantees the existence of n-dimensional invariant tori (close to the tori 
$\tau = \text{const}$), forming a set of almost full measure in a sufficiently small 
neighborhood of the equilibrium position. 

If n = 1, we have a mapping of the ordinary plane to itself, and the 
invariant tori become circles. The nondegeneracy condition means that, for 
the normal form, the derivative of the angle of rotation of a circle with respect 
to the area bounded by the circle is not zero (at the fixed point and, therefore, 
in some neighborhood of it). 

In the case n = 1 the nondegeneracy condition guarantees Liapunov 
stability of the fixed point of the mapping. We note that in this case the con- 
dition of absence of lower resonance has the form 


$$\lambda^3 \neq 1, \qquad \lambda^4 \neq 1.$$


Thus a fixed point of an area-preserving mapping of the plane to itself is 
Liapunov stable if the linear part of the mapping is rotation through an angle 
which is not a multiple of 90° or 120° and if the coefficient $\omega_{11}$ in the normal 
Birkhoff form is not zero (guaranteeing nontrivial dependence of the angle 
of rotation on the radius). 
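The equivalence between the two forms of the resonance condition is a one-line computation, which the sketch below carries out numerically for a rotation through a given angle. The function name and the tolerance are our own; the check covers only the absence of resonances of orders 3 and 4 and says nothing about the second requirement, that the coefficient in the normal form be nonzero.

```python
import cmath
import math


def has_low_order_resonance(angle_degrees, tol=1e-9):
    """True if the eigenvalue exp(i*angle) satisfies lambda**3 = 1 or lambda**4 = 1."""
    lam = cmath.exp(1j * math.radians(angle_degrees))
    return abs(lam ** 3 - 1) < tol or abs(lam ** 4 - 1) < tol


for angle in (72, 90, 100, 120, 180, 240):
    print(angle, has_low_order_resonance(angle))
```

Rotations through multiples of 90° or 120° are flagged, while an angle such as 72° (a resonance of order 5) or 100° is not.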

We have not gone into the smoothness conditions assumed in these 
theorems. The minimal smoothness needed is not known in even one case.
For example, we point out that the last assertion about stability of fixed
points of a mapping of the plane to itself was first proved by J. Moser under
the assumption of 333-times differentiability, and only later (by Moser and
Rüssmann) was the number of derivatives reduced to 6.


E Applications of the theorem on invariant tori 
and its generalizations 


There are many mechanical problems to which we can apply the theorem 
formulated above. One of the simplest of these problems is the motion of a 
pendulum under the action of a periodically changing exterior field or under 
the action of vertical oscillations of the point of suspension. 

It is well known that, in the absence of parametric resonance, the lower 
equilibrium position of a pendulum is stable in the linear approximation. The 
stability of this position with regard to nonlinear effects (under the further 
assumption of the absence of resonances of order 3 and 4) can be proved 
only with the help of the theorem on invariant tori. 

In an analogous way we can use the theorem on invariant tori to investigate 
conditionally-periodic motions of a system of interacting nonlinear os- 
cillators. 

Another example is the geodesic flow on a convex surface close to an 
ellipsoid. There are two degrees of freedom in this system, and we can show 
that most geodesics on a surface close to a triaxial ellipsoid oscillate
between two “caustics” close to the lines of curvature of the surface, densely 
filling out the ring between them. At the same time, we can arrive at theorems 
on the stability of the two closed geodesics obtained, after deforming the 
surface, from the two ellipses containing the middle axis of the ellipsoid (in 
the absence of resonances of orders 3 and 4). 

As one more example, we can look at closed trajectories on a billiard table 
of any convex shape. Among the closed billiard trajectories are those which 
are stable in the linear approximation, and we can conclude that in the 
general case they are actually stable. An example of such a stable billiard 
trajectory is the minor axis of an ellipse; therefore, a closed billiard trajec- 
tory, close to the minor axis of an ellipse on a billiard table which is almost 
the ellipse, is stable. 

Application of the theorem on invariant tori to the problem of rotations 
of an asymmetric heavy rigid body allows us to consider the nonintegrable 
case of a rapidly rotating body. The problem of rapid rotation is mathe- 
matically equivalent to the problem of motion with moderate velocity in a 
weak gravitational field: the essential parameter is the ratio of potential to 
kinetic energy. If this parameter is small, then we can use eulerian motion of 
a rigid body as a first approximation. 

By applying the theorem on invariant tori to the problem with two degrees 
of freedom obtained after eliminating cyclic coordinates (rotations around 
the vertical) we come to the following conclusion about the motion of a 
rapidly rotating body: if the kinetic energy of rotation of a body is sufficiently
large in comparison with the potential energy, then the length of the vector of
angular momentum and its angle with the horizontal remain forever close 
to their initial values. 

It follows from this that the motion of the body will forever be close to a 
combination of Euler-Poinsot motion and azimuthal precession, except in
the case when the initial values of kinetic energy and total momentum are 
close to those for which the body can rotate around the middle principal axis. 
In this last case, realized only for special initial conditions, the splitting of 
separatrices near the middle axis implies a more complicated undulation 
about the middle axis than in Euler-Poinsot motion. 

One generalization of the theorem on invariant tori leads to the theorem 
on the adiabatic invariance for all time of the action variable in a one- 
dimensional oscillating system with periodically changing parameters. Here 
we must assume that the rule for changing parameters is given by a fixed 
smooth periodic function of “slow time,” and the small parameter of the 
problem is the ratio of the period of characteristic oscillations and the period 
of change of parameters. Then, if the period of change of parameters is suffi- 
ciently large, the change in the adiabatic invariant of a phase point remains 
small in the course of an infinite interval of time. 

In an analogous way we can prove the adiabatic invariance for all time 
of the action variable in the problem of a charged particle in an axially- 
symmetric magnetic field. Violation of axial symmetry in this problem in- 
creases the number of degrees of freedom from two to three, so that the 
invariant tori cease to divide the energy level manifolds, and the phase curve 
wanders about the resonance zones. 

Finally, applying the theory to the three- (or many-) body problem, we 
succeed in finding conditionally periodic motions of “planetary type.” To 
describe these motions, we must say a few words about the next approxima- 
tion after the keplerian one in the problem of the motion of the planets. For 
simplicity we will limit ourselves to the planar problem. 

For each keplerian ellipse, consider the vector connecting the focus of the 
ellipse (i.e., the sun) to the center of the ellipse. This vector, called the Laplace 
vector, characterizes both the magnitude of the eccentricity of the orbit and the 
direction to the perihelion. 

The action of the planets on one another causes the keplerian
ellipse (and therefore the Laplace vector) to change slowly. In addition, there 
is an important difference between changes in the major semi-axis and 
changes in the Laplace vector. Namely, the major semi-axis has no secular 
perturbations, i.e., in the first approximation it merely oscillates slightly 
around its average value (“Laplace’s theorem”). The Laplace vector, on the 
other hand, performs both periodic oscillations and secular motion. The 
secular motion may be obtained if we spread each planet over its orbit 
proportionally to the time spent in travelling each piece of the orbit, and 
replace the attraction of the planets by the attraction of the rings obtained, 
that is, if we average the perturbation over the rapid motions. The true
motion of the Laplace vector is obtained from the secular one by the addition
of small oscillations; these oscillations are essential if we are interested
in small intervals of time (years), but their effect remains small in comparison 
to the effect of the secular motion if we consider a large interval of time 
(thousands of years). 

Calculations (carried out by Lagrange) show that the secular motion of 
the Laplace vector of each of n planets moving in one plane consists of the 
following (if we ignore the squares of the eccentricities of the orbits which 
are small in comparison with the eccentricities themselves). In the orbital 
plane of a planet we must arrange n vectors of fixed lengths, each rotating 
uniformly with its angular velocity. The Laplace vector is their sum. 
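
Lagrange's description can be written out directly. The sketch below is purely illustrative: the component lengths, angular velocities and phases are made-up numbers, not data for any real planet, and the function names are assumptions of this example.

```python
import math


def laplace_vector(t, lengths, rates, phases):
    """Sum of uniformly rotating plane vectors of fixed lengths at time t."""
    x = sum(a * math.cos(w * t + p) for a, w, p in zip(lengths, rates, phases))
    y = sum(a * math.sin(w * t + p) for a, w, p in zip(lengths, rates, phases))
    return x, y


# Hypothetical two-component example (arbitrary units, invented values).
LENGTHS = [0.03, 0.01]
RATES = [2.0e-4, 5.0e-4]    # radians per unit of time
PHASES = [0.0, 0.0]


def magnitude(t):
    """Length of the Laplace vector at time t for the example above."""
    return math.hypot(*laplace_vector(t, LENGTHS, RATES, PHASES))
```

With two components the length of the sum oscillates between the difference and the sum of the component lengths, with period 2π divided by the difference of the angular velocities; this slow variation is the secular change of the eccentricity.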

This description of the motion of the Laplace vector is obtained because 
the hamiltonian system averaged with respect to rapid motions, which 
describes the secular motion of the Laplace vector, has an equilibrium posi- 
tion corresponding to zero eccentricities. The described motion of the Lap- 
lace vector is the decomposition of small oscillations near this equilibrium 
position into characteristic oscillations. The angular velocities of the uni- 
formly rotating components of the Laplace vector are the characteristic 
frequencies, and the lengths of these components determine the amplitudes 
of the characteristic oscillations. 


We note that the motion of the Laplace vector of the earth is, apparently, one of the factors 
involved in the occurrence of ice ages. The reason is that, when the eccentricity of the earth’s 
orbit increases, the time it spends near the sun decreases, while the time it spends far from the 
sun increases (by the law of areas); thus the climate becomes more severe as the eccentricity 
increases. The magnitude of this effect is such that, for example, the amount of solar energy 
received in a year at the latitude of Leningrad (60°N) may attain the value which now corresponds
to the latitudes of Kiev (50°N) (for decreased eccentricity) and Taimir (80°N) (for increased 
eccentricity). The characteristic time of variation of the eccentricity (tens of thousands of years) 
agrees well with the interval between ice ages. 


The theorems on invariant tori lead to the conclusion that for planets of 
sufficiently small mass, there is, in the phase space of the problem, a set of 
positive measure filled with conditionally periodic phase curves such that 
the corresponding motion of the planets is nearly motion over slowly 
changing ellipses of small eccentricities, and the motion of the Laplace 
vectors is almost that given by the approximation described above. Further- 
more, if the masses of the planets are sufficiently small, then motions of this 
type fill up most of the region of phase space corresponding in the keplerian 
approximation to motions of the planets in the same direction over non- 
intersecting ellipses of small eccentricities. 

The number of degrees of freedom in the planar problem with n planets 
is equal to 2n if we take the sun to be fixed. The integral of angular momentum 
allows us to eliminate one cyclic coordinate; however, there are still too 
many variables for the invariant tori to divide an energy level manifold (even 
if there are only two planets this manifold is five-dimensional, and the tori 
are three-dimensional). Therefore, in this problem we cannot draw any
conclusions about the preservation of the major semi-axes over an infinite interval
of time for all initial conditions, but only for most initial conditions.

A problem with two degrees of freedom is obtained by further idealization. 
We replace one of the two planets by an “asteroid” which moves in the field 
of the second planet (“Jupiter”), not perturbing its motion. 

The problem of the motion of such an asteroid is called the restricted 
three-body problem. The planar restricted three-body problem reduces to a 
system with two degrees of freedom, periodically depending on time, for the 
motion of the asteroid. If, in addition, the orbit of Jupiter is circular, then in a 
coordinate system rotating together with it we obtain, for the motion of the 
asteroid, an autonomous hamiltonian system with two degrees of freedom— 
called the planar restricted circular three-body problem. 

In this problem, there is a small parameter—the ratio of the masses of 
Jupiter and the sun. The zero value of the parameter corresponds to un- 
perturbed keplerian motion of the asteroid, represented in our four-dimen- 
sional phase space as a conditionally-periodic motion on a two-dimensional 
torus (since the coordinate system is rotating). One of the frequencies of this 
conditionally-periodic motion is equal to 1 for all initial conditions; this is 
the angular velocity of the rotating coordinate system, i.e., the frequency of 
the revolution of Jupiter around the sun. The second frequency depends on 
the initial conditions (this is the frequency of the revolution of the asteroid 
around the sun) and is fixed on any fixed three-dimensional level manifold 
of the hamiltonian function. 

Therefore, the nondegeneracy condition is not fulfilled in our problem, but 
the condition for isoenergetic nondegeneracy is fulfilled. Kolmogorov’s 
theorem applies, and we conclude that most invariant tori with irrational 
ratios of frequencies are preserved in the case when the mass of the perturbing 
planet (Jupiter) is not zero, but sufficiently small. 

Furthermore, the two-dimensional invariant tori divide the three- 
dimensional level manifolds of the hamiltonian function. Therefore, the 
magnitude of the major semi-axis and the eccentricity of the keplerian 
ellipse of the asteroid will remain forever near their initial values if, at the 
initial moment, the keplerian ellipse does not intersect the orbit of the 
perturbing planet, and if the mass of this planet is sufficiently small. 

In addition, in a stationary coordinate system, the keplerian ellipse of the 
asteroid could slowly rotate, since our system is only isoenergetically non- 
degenerate. Therefore under perturbations of an invariant torus frequencies 
are not preserved, but only their ratios. As a result of a perturbation, the 
frequency of azimuthal motion of the perihelion of the asteroid in a stationary 
coordinate system could be slightly different from Jupiter’s frequency, and 
then in the stationary system the perihelion would slowly rotate.


Appendix 9: Poincaré’s geometric theorem, its generalizations and applications


In his study of periodic solutions of problems in celestial mechanics, H.
Poincaré constructed a very simple model which contains the basic difficulties 
of the problem. This model is an area-preserving mapping of the planar 
circular annulus to itself. Mappings of this form arise in the study of dynam- 
ical systems with two degrees of freedom. In fact, a mapping of a two- 
dimensional surface of section to itself is defined as follows: each point p of 
the surface of section is taken to the next point at which the phase curve 
originating at p intersects the surface (cf. Appendix 7). Thus, a closed phase 
curve corresponds to a fixed point of the mapping or of a power of the 
mapping. Conversely, every fixed point of the mapping or of a power of 
the mapping determines a closed phase curve. 

In this way, a question about the existence of periodic solutions of prob- 
lems in dynamics is reduced to a question about fixed points of area-pre- 
serving mappings of the annulus to itself. In studying such mappings, 
Poincaré arrived at the following theorem. 


A Fixed points of mappings of the annulus to itself 


Theorem. Suppose that we are given an area-preserving homeomorphic mapping 
of the planar circular annulus to itself. Assume that the boundary circles of 
the annulus are turned in different directions under the mapping. Then this 
mapping has at least two fixed points. 


The condition that the boundary circles are turned in different directions 
means that, if we choose coordinates (x, y mod 2π) on the annulus so that the
boundary circles are x = a and x = b, then the mapping is defined by the
formula


(x, y) → (f(x, y), y + g(x, y)),


where the functions f and g are continuous and 2π-periodic in y, with
f(a, y) = a, f(b, y) = b, and g(a, y) < 0, g(b, y) > 0 for all y.

The proof of this theorem, announced by Poincaré not long before his 
death, was given only later by G. D. Birkhoff (cf. his book, Dynamical 
Systems, Amer. Math. Soc., 1927). 

There remain many open questions related to this theorem; in particular, 
attempts to generalize it to higher dimensions are important for the study 
of periodic solutions of problems with many degrees of freedom. The argu- 
ment Poincaré used to arrive at his theorem applies to a whole series of other 
problems. However, the intricate proof given by Birkhoff does not lend itself 
to generalization. Therefore, it is not known whether the conclusions sug- 
gested by Poincaré’s argument are true beyond the limits of the theorem on 
the two-dimensional annulus. The argument in question is the following.


B The connection between fixed points of a
mapping and critical points of the generating 
function 


We will define a symplectic diffeomorphism of the annulus 
(x, y) → (X, Y)


with the help of the generating function Xy + S(X, y), where the function S
is 2π-periodic in y. For this to be a diffeomorphism we need that ∂X/∂x ≠ 0.
Then 


dS = (x — X)dy + (Y — y)dX, 


and, therefore, the fixed points of the diffeomorphism are critical points of 
the function F(x, y) = S(X(x, y), y). This function F can always be constructed 
by defining it as the integral of the form (x — X)dy + (Y — y)dX. The 
gradient of this function is directed either inside the annulus or outside on 
both boundary circles at once (by the condition on rotation in different 
directions). 
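
The correspondence between fixed points and critical points can be tried out on a concrete mapping. The sketch below assumes a hypothetical generating function $S(X, y) = X^2/2 + \varepsilon (1 - X^2)^2 \cos y$ on the annulus $-1 \le x \le 1$; this choice, the value of `EPS` and all function names are illustrative assumptions, not taken from the text. The mapping is recovered from $x - X = \partial S/\partial y$ and $Y - y = \partial S/\partial X$.

```python
import math

EPS = 0.1  # hypothetical size of the perturbation


def h(X):
    return (1.0 - X * X) ** 2            # vanishes on the boundary circles X = -1, 1


def dh(X):
    return -4.0 * X * (1.0 - X * X)      # derivative of h


def S_X(X, y):
    """dS/dX, which equals Y - y."""
    return X + EPS * dh(X) * math.cos(y)


def S_y(X, y):
    """dS/dy, which equals x - X."""
    return -EPS * h(X) * math.sin(y)


def apply_map(x, y):
    """Image (X, Y) of (x, y); Y is the lifted angle, not reduced mod 2*pi."""
    X = x
    for _ in range(50):                  # Newton iteration for x = X + S_y(X, y)
        residual = X + S_y(X, y) - x
        slope = 1.0 - EPS * dh(X) * math.sin(y)   # dx/dX, positive for EPS = 0.1
        X -= residual / slope
    return X, y + S_X(X, y)
```

For this example the boundary circles are shifted by −1 and +1, so they turn in different directions, and the only critical points of S are (0, 0) and (0, π); both are fixed points of the mapping. Here F and S have the same critical points because ∂X/∂x does not vanish for this value of `EPS`.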

But every smooth function on the annulus whose gradient on both bound- 
ary circles is directed inside the annulus (or out from it) has a critical point 
(maximum or minimum) inside the annulus. Furthermore, it can be shown 
that the number of critical points of such a function on the annulus is at least 
two. Therefore, we could assert that our diffeomorphism has at least two
fixed points if we were sure that every critical point of F is a fixed point of
the mapping. 

Unfortunately, this is true only under the condition that ∂X/∂x ≠ 0, so
that we can express F in terms of X and y. Thus our argument is valid 
for mappings which are not too different from the identity. For example, it is 
sufficient that the derivatives of the generating function S be less than 1. 

A refinement of this argument (with a different choice of generating 
function¹¹¹) shows that it is even sufficient that the eigenvalues of the Jacobi
matrix D(X, Y)/D(x, y) never be equal to −1 at any point, i.e., that our
mapping never flips the tangent space at any point. Unfortunately, all such 
conditions are violated at some points for mappings far from the identity. 
The proof of Poincaré’s theorem in the general case uses entirely different 
arguments. 

The connection between fixed points of mappings and critical points of 
generating functions seems to be a deeper fact than the theorem on mappings 
of a two-dimensional annulus into itself. Below, we give several examples in 
which this connection leads to meaningful conclusions which are true under 
some restrictions whose necessity is not obvious. 

¹¹¹ $d\Phi = \frac{1}{2} \begin{vmatrix} X - x & Y - y \\ dX + dx & dY + dy \end{vmatrix}$


C Symplectic diffeomorphisms of the torus


Consider a symplectic diffeomorphism of the torus which fixes the center of 
gravity 


(x, y) → (x + f(x, y), y + g(x, y)) = (X, Y),


where x and y mod 2π are angular coordinates on the torus, “symplectic”
means the Jacobian D(X, Y)/D(x, y) is equal to 1, and the condition on 
preserving the center of gravity means that the average values of the functions 
f and g are equal to zero. 


Theorem. Such a diffeomorphism has at least four fixed points, counting 
multiplicity, and at least three geometrically different ones, at least under the 
assumption that the eigenvalues of the Jacobi matrix are not equal to —1 at 
any point. 


The proof is based on consideration of the function on the torus given by 
the formula 


$$\Phi(x, y) = \frac{1}{2} \int (X - x)(dY + dy) - (Y - y)(dX + dx),$$


and on the fact that a smooth function on the torus has at least four critical 
points (counting multiplicity) of which at least three are geometrically 
different. 

Attempts at proving this theorem without restrictions on the eigenvalues 
meet with difficulties very similar to those encountered by Poincaré in the 
theorem about the annulus. 


We note that the theorem about the annulus would follow from the theorem about the torus 
if in the latter we could throw out the condition on the eigenvalues. In fact, we can put together 
a torus from two copies of our annulus, inserting a narrow connecting annulus along each of 
the two boundary circles. 

Then we can extend our mapping of the annulus to a symplectic diffeomorphism of the 
torus such that: (1) on each of the two large annuli the diffeomorphism coincides with the 
original, (2) on each of the connecting annuli the diffeomorphism has no fixed points, and (3) 
the center of gravity remains fixed. 

The construction of such a diffeomorphism of the torus uses the property that the boundary 
circles rotate in different directions. On each connecting annulus all points are translated in the 
same direction as on both circles bounding the connecting annulus. Since the translations on 
the connecting annuli are in opposite directions, the size of the translations can be chosen to 
ensure preservation of the center of gravity. 

Now out of four fixed points on the torus, two must lie in the original annulus, and we obtain 
the theorem on annuli from the theorem on tori. 


The theorem on tori formulated above can be generalized to other 
symplectic manifolds, both two-dimensional and many-dimensional. To 
formulate these generalizations, we must first reformulate the condition of 
preservation of the center of gravity.

Let g: M → M be a symplectic diffeomorphism. We say that g is homologous
to the identity if it can be connected to the identity diffeomorphism
by a smooth curve $g_t$ consisting of symplectic diffeomorphisms such that
the field of velocities $\dot g_t$ at each moment of time t has a single-valued
hamiltonian function. It can be shown that the symplectic diffeomorphisms
homologous to the identity form the commutator subgroup of the connected
component of the identity in the group of all symplectic diffeomorphisms of
the manifold.

In the case when our manifold is the two-dimensional torus, the sym- 
plectic diffeomorphisms homologous to the identity are exactly those which 
preserve the center of gravity. 

Thus we come to the following generalization of Poincaré’s theorem. 


Theorem. Every symplectic diffeomorphism of a compact symplectic manifold, 
homologous to the identity, has at least as many fixed points as a smooth 
function on this manifold has critical points (at least if this diffeomorphism 
is not too far from the identity).¹¹²


We note that the condition of the mapping being homologous to the 
identity is essential, as we see already from the example of a translation on 
the torus, which has no fixed points at all. 

As to the last restriction (that the diffeomorphism be not too far from the 
identity), it is not clear whether it is essential.¹¹²ᵃ In the case that our manifold
is the two-dimensional torus, it is sufficient that none of the eigenvalues of the 
Jacobi matrix of the diffeomorphism (in any global symplectic coordinate 
system on $\mathbb{R}^{2n}$) be equal to minus one.


A restriction of this sort may be necessary in higher-dimensional problems. It is not im- 
possible that Poincaré’s theorem is due to an essentially two-dimensional effect, as is the 
following theorem of A. I. Shnirel’man and N. A. Nikishin: every area-preserving diffeomorphism 
of the two-dimensional sphere to itself has at least two geometrically different fixed points. 

The proof of this theorem is based on the fact that the index of the gradient vector field 
of a smooth function of two variables at an isolated critical point cannot be greater than 1 
(although it can be equal to 1, 0, −1, −2, −3, ...), and the sum of the indices of all the fixed
points of an orientation-preserving diffeomorphism of the two-dimensional sphere to itself 
is equal to 2. On the other hand, the index of the gradient of a smooth function of a large number 
of variables at a critical point can take any integer value. 


¹¹² [For a proof, see V. Arnold, Sur les propriétés topologiques des applications globalement
canoniques de la mécanique classique, C. R. Acad. Sci. Paris, 1965 and A. Weinstein, Symplectic
manifolds and their lagrangian submanifolds, Advances in Math. 6 (1971) 329-346.]

¹¹²ᵃ [Recently, Conley and Zehnder, followed by others, have proved the theorem for tori,
surfaces, and other manifolds, without the restriction of closeness to the identity.]


D Intersections of lagrangian manifolds


Poincaré’s argument can be given a slightly different form if on every
radius of the annulus we consider the points shifted only radially. There are
such points on every radius, since the boundary circles of the annulus turn
in different directions. Assume that we can make a smooth curve of radially
shifting points, separating the interior and exterior circles of the annulus.
Then the image of this curve under our mapping must intersect the curve
(since the regions into which the curve divides the annulus are carried to
regions of equal area).

If this curve and its image each intersect each radius once, then the points 
of intersection of the curve with its image are obviously fixed points of the 
mapping. 

Part of this argument can be carried out in higher dimensions, and this 
gives useful results about periodic solutions of problems in dynamics. The 
role of the annulus in the many-dimensional case is played by the phase 
space: the direct product of a region in euclidean space with a torus of the 
same dimension (the annulus is the product of an interval with the circle). 
A symplectic structure on the phase space is defined in the usual way, i.e., it has 
the form $\Omega = \sum_k dx_k \wedge dy_k$, where the $x_k$ are action variables and the $y_k$ are angle
variables. 

It is not difficult to explain which symplectic diffeomorphisms of our 
phase space are homologous to the identity. Namely, a symplectic diffeo- 
morphism A is homologous to the identity if it can be obtained from the 
identity by a continuous deformation and if 


$$\oint_{\gamma} x\,dy = \oint_{A\gamma} x\,dy$$


for any closed contour γ (not necessarily homologous to zero). The condition
that the transformation be homologous to the identity prohibits systematic 
shifts along the x-direction (“evolution of the action variables”), but permits 
shifts along the tori. 
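
For n = 1 the integral condition can be checked numerically. The sketch below uses a kicked map of standard-map type as a stand-in example (it is not a mapping discussed in the text); the parameter `drift`, the value of `EPS` and the function names are assumptions of this illustration. With `drift = 0` the integral of x dy over a circle and over its image agree, while a nonzero `drift` produces a systematic shift in the x-direction and changes the integral.

```python
import math

TWO_PI = 2.0 * math.pi
EPS = 0.3  # hypothetical perturbation strength


def kick_map(x, y, drift=0.0):
    """Symplectic map of the annulus; drift != 0 shifts every circle radially."""
    X = x + drift + EPS * math.sin(y)
    return X, y + X                      # lifted angle, not reduced mod 2*pi


def loop_integral(points):
    """Integral of x dy along a closed polygon that winds once around the annulus."""
    total = 0.0
    n = len(points)
    for j in range(n):
        x0, y0 = points[j]
        x1, y1 = points[(j + 1) % n]
        if j == n - 1:
            y1 += TWO_PI                 # close the loop after one full turn
        total += 0.5 * (x0 + x1) * (y1 - y0)
    return total


def circle(c, samples=2000):
    """Sample points of the circle x = c."""
    return [(c, TWO_PI * j / samples) for j in range(samples)]
```

In this example the map with nonzero `drift` can still be deformed continuously to the identity, but it is not homologous to the identity, since the integral changes by 2π times the drift.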

We consider one of the n-dimensional tori x = c = const and apply to 
it our symplectic diffeomorphism homologous to the identity. It turns out 
that the original torus intersects its image in at least $2^n$ points (counting
multiplicities), of which at least n + 1 are geometrically different, at least 
under the assumption that the image torus has an equation of the form 
x = f(y), where f is smooth. 

For n = 1, this assertion means that each of the concentric circles con- 
stituting the annulus intersects its image in at least two points. This also 
follows from the preservation of area, so that the assumption that the image 
has equation x = f(y) is not necessary. 

Whether or not this assumption is necessary in higher dimensions is not 
known. If we make this assumption, the proof proceeds in the following way. 

We note that the original torus is a lagrangian submanifold of phase
space. Our diffeomorphism is symplectic, so the image torus is also lagrang- 
ian. Therefore, the 1-form (x — c)dy on it is closed. Furthermore, this form 
on the torus is the total differential of some single-valued smooth function F, 
since our diffeomorphism is homologous to the identity, and therefore for
any closed contour γ we have


$$\oint_{A\gamma} (x - c)\,dy = \oint_{A\gamma} x\,dy - \oint_{A\gamma} c\,dy = \oint_{\gamma} x\,dy - \oint_{A\gamma} c\,dy = c \oint_{\gamma} dy - c \oint_{A\gamma} dy = 0.$$


We note that points of intersection of the torus with its image are critical 
points of the function F (since at them dF = (x — c)dy = 0). 

From the condition of single-valued projection of the image torus (i.e., 
from the fact that the image torus has equation x = f(y)) it follows that, 
conversely, all critical points of the function F are points of intersection of 
our tori. In fact, under these conditions y can be taken for local coordinates 
on the torus, and therefore the fact that dF is zero for all vectors tangent to 
the image torus implies x = c. 

A smooth function on an n-dimensional torus has at least $2^n$ critical points,
counting multiplicities, of which at least n + 1 are geometrically different 
(cf., for example, Milnor, “ Morse Theory,” Princeton University Press, 1967). 
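
The bound $2^n$ is attained by simple examples. As an illustration (the function below is our choice, not one taken from the text), consider $F(y) = \cos y_1 + \cdots + \cos y_n$ on the n-dimensional torus; the helper names are assumptions of this sketch.

```python
import itertools
import math


def critical_points(n):
    """Critical points of F(y) = cos(y_1) + ... + cos(y_n) on the n-torus.

    dF = 0 forces sin(y_i) = 0 for every i, so each y_i is 0 or pi.
    """
    return list(itertools.product((0.0, math.pi), repeat=n))


def morse_index(point):
    """Number of negative Hessian eigenvalues; -cos(y_i) < 0 exactly when y_i = 0."""
    return sum(1 for y in point if math.cos(y) > 0)
```

All $2^n$ critical points of this function are nondegenerate, and the number of points of index k is the binomial coefficient "n choose k".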

Therefore, our tori intersect in at least $2^n$ points (counting multiplicities),
and there are at least n + 1 geometrically different points of intersection. 

Exactly the same argument shows that any lagrangian torus intersects 
its image in at least $2^n$ points (of which at least n + 1 are geometrically
different), under the assumption that both the original torus and its image
project single-valued onto the y-space, i.e., are given by equations x = f(y)
and x = g(y), respectively. Besides, this statement reduces to the previous
one by the canonical transformation (x, y) → (x − f(y), y).


E Applications to determining fixed points and periodic solutions 


We now consider a symplectic transformation, homologous to the identity, 
of the special form which arises in integrable problems in dynamics, i.e., of 
the form 


$$A_0(x, y) = (x, y + \omega(x)), \quad \text{where } \omega = \frac{\partial S}{\partial x}.$$


Here $x \in \mathbb{R}^n$ is the action variable and $y \bmod 2\pi \in T^n$ is the angular coordinate.

We assume that on the torus $x = x_0$ all the frequencies are commensurable:


$$\omega_i(x_0) = \frac{k_i}{N}\, 2\pi \quad \text{with integers } k_i, N; \qquad \omega(x_0) \ne 0,$$

and that the nondegeneracy condition

$$\det \left| \frac{\partial \omega}{\partial x} \right|_{x_0} \ne 0$$


is satisfied.


Theorem. Every symplectic diffeomorphism A homologous to the identity and
sufficiently close to $A_0$ has, near the torus $x = x_0$, at least $2^n$ periodic
points ξ of period N (such that $A^N \xi = \xi$), counting multiplicity.


The proof could be reduced to investigating the intersection of two lagrangian submanifolds
of a 4n-dimensional space ($\mathbb{R}^n \times T^n \times \mathbb{R}^n \times T^n$) with $\Omega = dx \wedge dy - dX \wedge dY$, one of which
is the diagonal (X = x, Y = y) and the other the graph of the mapping $A^N$.

However, it is easier to directly construct a suitable function on the torus. In fact, the mapping
$A_0^N$ has the form


$$(x, y) \mapsto (x, y + \alpha(x)), \quad \text{where } \alpha(x_0) = 0, \quad \det \left| \frac{\partial \alpha}{\partial x} \right|_{x_0} \ne 0.$$


By the implicit function theorem, the mapping $A^N$ has, near the torus $x = x_0$, a torus which is
displaced only radially ((x, y) → (X, Y)) and is given by an equation of the form x = f(y);
its image is also given by an equation x = g(y) of the same form. In this notation, X(f(y), y) =
g(y), Y(f(y), y) = y.

Since A is homologous to the identity, it follows that $A^N$ has a single-valued global generating
function of the form Xy + S(X, y), where S has period 2π in the variable y.

The function F(y) = S(X(f(y), y), y) has at least $2^n$ critical points $y_k$ on the torus. All the
points $\xi_k = (f(y_k), y_k)$ are fixed points for $A^N$. In fact,


dF = (x − X)dy + (Y − y)dX = (x − X) dy = (f(y) − g(y)) dy.


Therefore, since $dF|_{y_k} = 0$, it follows that $f(y_k) = g(y_k)$, i.e., $A^N \xi_k = \xi_k$, as was to be shown.
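
The construction in this proof can be followed numerically for n = 1. The sketch below is an illustration under stated assumptions: the unperturbed mapping is taken with ω(x) = x, the resonant circle is x = π (so k = 1, N = 2), and the perturbation is a kick of standard-map type with a hypothetical strength `EPS`; none of these choices, nor the function names, come from the text. The code finds the radially displaced curve x = f(y) by Newton's method and then locates the zeros of the radial shift.

```python
import math

TWO_PI = 2.0 * math.pi
EPS = 0.3   # hypothetical perturbation strength
N = 2       # period sought; the resonant circle is x = pi


def perturbed_map(x, y):
    """One step of A: (x, y) -> (x + EPS*sin(y), y + x_new); angle kept lifted."""
    x_new = x + EPS * math.sin(y)
    return x_new, y + x_new


def iterate(x, y, times=N):
    for _ in range(times):
        x, y = perturbed_map(x, y)
    return x, y


def radially_displaced_x(y):
    """x = f(y) such that A^N advances the angle of (x, y) by exactly 2*pi."""
    x1 = math.pi                         # value of the action after the first step
    for _ in range(50):                  # Newton: 2*x1 + EPS*sin(y + x1) = 2*pi
        g = 2.0 * x1 + EPS * math.sin(y + x1) - TWO_PI
        x1 -= g / (2.0 + EPS * math.cos(y + x1))
    return x1 - EPS * math.sin(y)


def radial_shift(y):
    """Change of the action under A^N on the curve x = f(y), i.e. g(y) - f(y)."""
    x = radially_displaced_x(y)
    return iterate(x, y)[0] - x


def periodic_points(samples=400):
    """Zeros of radial_shift, found by sign changes on a grid plus bisection."""
    step = TWO_PI / samples
    found = []
    for j in range(samples):
        a = (j + 0.5) * step
        b = a + step
        fa, fb = radial_shift(a), radial_shift(b)
        if fa * fb < 0.0:
            for _ in range(60):
                m = 0.5 * (a + b)
                fm = radial_shift(m)
                if fa * fm <= 0.0:
                    b = m
                else:
                    a, fa = m, fm
            y = 0.5 * (a + b)
            found.append((radially_displaced_x(y), y % TWO_PI))
    return found
```

For these parameter values the search returns four points of period 2 (two periodic orbits), which is consistent with the lower bound $2^n = 2$ of the theorem.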


We turn now to closed orbits of conservative systems. Using the term- 
inology of Appendix 8, we can formulate the result as follows. 


Corollary. Upon disintegration of an n-dimensional torus, entirely filled up by 
closed trajectories of an isoenergetically nondegenerate system, at least
$2^{n-1}$ closed trajectories of the perturbed problem are formed (counting
multiplicities), among which at least n are geometrically distinct, at least 
if the perturbation is sufficiently small. 


The proof is reduced to the preceding theorem with the help of a (2n — 2)- 
dimensional surface of section. We must first choose angular coordinates y 
such that the closed trajectories of the unperturbed problem on the torus 
are given by the equations $y_2 = \cdots = y_n = 0$, and then define a surface of
section by $y_1 = 0$.

In the case of two degrees of freedom we can apply Poincaré’s theorem to
the annuli formed by intersecting invariant tori with a two-dimensional
surface of section. We obtain the following result:

In the gap between two two-dimensional invariant tori of a system with
two degrees of freedom there are always at least two closed phase trajectories,
if the ratios of the frequencies of conditionally-periodic motions on these tori
are different.
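
One way to see where such trajectories come from is the following standard remark, given here as a sketch rather than as part of the argument of the text. If the frequency ratios on the two tori are $\rho_1 \ne \rho_2$, then for any rational number p/q strictly between them the q-th power of the mapping of the surface of section, followed by a rotation through −2πp, turns the two boundary circles of the annulus in different directions. The hypothetical helper below (its name and interface are assumptions of this example) finds the fraction with the smallest denominator between two given nonnegative ratios by descending the Stern-Brocot tree.

```python
from fractions import Fraction


def simplest_fraction_between(rho_1, rho_2):
    """Fraction p/q of least denominator strictly between two nonnegative numbers."""
    low, high = sorted((rho_1, rho_2))
    if low < 0 or low == high:
        raise ValueError("need two distinct nonnegative ratios")
    ln, ld, hn, hd = 0, 1, 1, 0          # Stern-Brocot bounds 0/1 and 1/0
    while True:
        mn, md = ln + hn, ld + hd        # mediant of the current bounds
        if mn <= low * md:               # mediant is not above the lower ratio
            ln, ld = mn, md
        elif mn >= high * md:            # mediant is not below the upper ratio
            hn, hd = mn, md
        else:
            return Fraction(mn, md)
```

For ratios 3/10 and 7/20, for example, the helper returns 1/3, which points to closed trajectories that close up after three intersections with the surface of section.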

In this way we obtain many periodic solutions in all problems with two
degrees of freedom, where invariant tori are found (for example, in the restricted
circular three-body problem, in the problem of closed geodesics, etc.).
There is even a conjecture that in hamiltonian systems of “general form” with
compact phase spaces, the closed phase curves form a dense set.¹¹³ However,
if this is true, the closedness of most of these curves has little importance
since their periods are extremely large.

As an example of applying Poincaré’s methods to systems with more than 
two degrees of freedom, we have a theorem of Birkhoff about the existence of 
infinitely many periodic solutions close to a given linearly stable periodic 
solution of general form (or about the existence of infinitely many periodic 
points in a neighborhood of a fixed point of a linearly stable nondegenerate 
symplectic mapping of a space to itself). In the proof, the mapping is first 
approximated by its normal form, and then the connection between fixed 
points of a mapping and critical points of the generating function is used. 

Knowing periodic solutions allows us, among other things, to prove the 
nonexistence of first integrals (other than the classical ones) in many problems 
in dynamics. Assume, for example, that on some level manifold of known 
integrals we discover a periodic trajectory which is unstable. Its separatrices, 
in general, form a complicated network, which we considered in Appendix 7. 
If this phenomenon of splitting of separatrices is discovered, and if we can 
show that the separatrices are not contained in any manifold of lower dimen- 
sion than the level manifold we are considering, then we can be sure that the 
system has no new first integrals. 

The complicated behavior of phase curves, which obstructs the existence 
of first integrals, can often be detected without the help of periodic solutions 
by one simple glance at the picture, obtained by a computer, formed by the 
intersection of the phase curves with the surface of section. 


F Invariance of generating functions 


We have already noted the discouraging noninvariance of generating 
functions with respect to the choice of a canonical coordinate system on a 
symplectic manifold. On the other hand, we repeatedly used the connection 
between fixed points of a mapping and critical points of the generating 
function. 

It turns out that, although generally the generating function is not in- 
variantly associated to the mapping, near a fixed point there is an invariant 
connection. More precisely, suppose we are given a symplectic diffeo- 
morphism fixing some point. In a neighborhood of this point, we define a 
“generating function” 


$$\Phi = \frac{1}{2} \int \sum_k \begin{vmatrix} X_k - x_k & Y_k - y_k \\ dX_k + dx_k & dY_k + dy_k \end{vmatrix}$$


113 A proof of this density in the C^1-topology has been announced by C. Pugh and C. Robinson.
[Editor’s note] 


with the help of some symplectic coordinate system (x, y).¹¹⁴ Using another
symplectic coordinate system (x′, y′), we construct a generating function Φ′
in the same way.


Theorem. If the linearization of the symplectic diffeomorphism at the fixed
point has no eigenvalues equal to −1, then the functions Φ and Φ′ are equivalent
in a neighborhood of the fixed point, in the sense that there is a diffeomorphism
g (in general not symplectic) such that


$$\Phi(z) = \Phi'(g(z)) + \text{const}.$$


For the proof see the article: A. Weinstein, The invariance of Poincaré’s 
generating function for canonical transformations, Inventiones Mathe- 
maticae, 16, No. 3 (1972), 202-214. 

It should be noted that two diffeomorphisms with generating functions 
which are equivalent in a neighborhood of a fixed point are not necessarily 
equivalent in the class of symplectic diffeomorphisms (for example, rotation 
and rotation through an angle which depends on the radius, with non- 
degenerate quadratic parts of the generating function at zero). 

Since the first edition of this book appeared in 1974, the content of
this Appendix has grown into a new branch of mathematics: symplectic 
topology. To describe this development (triggered by the conjectures in this 
Appendix, which still remain, for general manifolds, neither proved, nor 
disproved) one would need a book longer than the present one. 

The interested reader might follow this development using the (incomplete) 
bibliography on pages 503-509. 


114 The increase of this function along any arc is equal to the integral of the form defining the
symplectic structure over the band formed by the rectilinear intervals connecting each point 
with its image. Therefore, the function ® is associated to the mapping invariantly with respect 
to linear canonical changes of coordinates. 


Appendix 10: Multiplicities of characteristic frequencies,
and ellipsoids depending on parameters


Several times in this course we have encountered families of ellipsoids in 
euclidean space. For example, in studying the dependence on parameters of 
characteristic frequencies of small oscillations, we encountered equipotential 
surfaces which were ellipsoids in euclidean space, depending upon the degree
of rigidity of the system (the metric of the space was defined by the kinetic
energy). Another example was the ellipsoid of inertia of a rigid body (the 
parameter here was the shape of the rigid body and its distribution of mass). 

Here we will consider the general problem of describing the values of the 
parameter for which the spectrum of eigenvalues degenerates, i.e., the corresponding
ellipsoid becomes an ellipsoid of revolution. We note that the
eigenvalues of a quadratic form on euclidean space (or the lengths of the axes 
of an ellipsoid) change continuously under continuous changes of the 
parameters of a system (the coefficients of the form). It seems natural to 
expect that in a system depending on one parameter, under changes of the 
parameter, at certain moments one of the eigenvalues would collide with 
another, so that for these values of the parameter the system would have a 
multiple spectrum. 

Suppose, for example, that we want to make the ellipsoid of inertia of a 
rigid body into an ellipsoid of revolution by movement of an adjustable mass 
along an arc rigidly attached to the body so that there is one parameter at 
our disposal. The three major axes a, b, and c will be continuous functions of 
this parameter, and at first glance it seems that for a suitable value of the 
parameter (p) we can achieve equality of two of the axes, say a(p) = b(p). It 
turns out, however, that this is not so, and that generally we need to attach 
at least two adjustable masses to make the ellipsoid of inertia an ellipsoid of 
revolution. 

In general, a multiple spectrum in typical families of quadratic forms is 
observed only for two or more parameters, while in one-parameter families 
of general form the spectrum is simple for all values of the parameter. Under 
a change of parameter in the typical one-parameter family, the eigenvalues 
can approach closely, but when they are sufficiently close, it is as if they 
begin to repel one another. The eigenvalues again diverge, disappointing the 
person who hoped, by changing the parameter, to achieve a multiple spec- 
trum. 

In this appendix we consider the reasons for this seemingly strange be- 
havior of the eigenvalues, and we discuss briefly analogous questions for 
systems with various groups of symmetries. 


A The manifold of ellipsoids of revolution 


Consider the set of all possible quadratic forms on the n-dimensional euclidean
space R^n. This set itself has a natural structure of a vector space of
dimension n(n + 1)/2. For example, the quadratic forms on the plane form a
three-dimensional space (a form Ax² + 2Bxy + Cy² has as coordinates the
three numbers A, B, and C).

The positive-definite forms form an open region in this space of all
quadratic forms (for example, in the case of the plane this is the inside of one
nappe of the cone B² = AC of degenerate forms).

Every ellipsoid centered at the origin defines a positive-definite quad- 
ratic form, for which it is the level set of 1; conversely, the set of level 1 of any 
positive-definite quadratic form is an ellipsoid. We can therefore identify the 
sets of positive-definite quadratic forms and ellipsoids centered at the origin. 
In this way we give the set of ellipsoids with center 0 in R^n the structure of a
smooth manifold of dimension n(n + 1)/2 (this manifold is covered by one 
chart: a region in the space of quadratic forms). 

Now consider the set of all ellipsoids of revolution. We claim that this set 
has codimension 2 in the space under consideration, i.e., it is given by two 
independent equations, rather than one as it would seem at first glance. More 
precisely, we have 


Theorem 1. The set of ellipsoids of revolution is a finite union of smooth sub- 
manifolds of codimension 2 and higher in the manifold of all ellipsoids. 


The codimension of a manifold is the difference between the dimension 
of the ambient space and the dimension of the submanifold. 


PROOF. We first consider an ellipsoid in n-dimensional space which has two
equal axes, and whose other axes are distinct. Such an ellipsoid is defined by
the directions of the distinct axes, which gives


$$(n - 1) + (n - 2) + \cdots + 2 = \frac{(n + 1)(n - 2)}{2}$$


different parameters, and also by the magnitudes of the axes, which gives
n − 1 parameters. Thus the total number of parameters is


$$\frac{n^2 - n - 2 + 2n - 2}{2},$$


which is two less than the dimension of the space of all ellipsoids (which is 
n(n + 1)/2). This count of parameters also shows that the set of ellipsoids 
with exactly two equal axes is a manifold. 

As for ellipsoids with a larger number of equal axes, it is clear that they 
form a set of even smaller dimension. A rigorous proof follows from the 
following lemma. 


Lemma. The set of all ellipsoids with ν_2 double, ν_3 triple, ν_4 four-fold axes, etc.
is a smooth submanifold of the manifold of all ellipsoids, with codimension
2ν_2 + 5ν_3 + 9ν_4 + ··· = Σ_i ½(i − 1)(i + 2)ν_i.

The proof of this lemma reduces to the same kind of parameter count as
in the special case analyzed above (which corresponds to ν_2 = 1, ν_3 =
ν_4 = ··· = 0). The reader can easily carry out this calculation, noting first
that the dimension of the manifold of all k-dimensional subspaces in an n- 
dimensional vector space is equal to k(n — k) (since a k-dimensional plane in 
general position in an n-dimensional space can be thought of as the graph of 
a mapping from a k-dimensional space to an (n — k)-dimensional space, and 
such a mapping is given by a rectangular k x (n — k) matrix). 
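
The parameter count suggested above can be carried out mechanically. The following is a minimal sketch, not the author's calculation: it assumes that the mutually orthogonal axis subspaces are chosen one after another, each k-dimensional subspace of the remaining m-dimensional space costing k(m − k) parameters, and that each distinct axis length adds one more parameter. The function names are ours.

```python
def directions_dim(mults):
    """Parameters needed to fix the mutually orthogonal axis subspaces.

    mults lists the multiplicity of each distinct axis length. Subspaces are
    chosen one after another; a k-dimensional subspace of an m-dimensional
    space contributes k * (m - k) parameters.
    """
    remaining = sum(mults)
    total = 0
    for k in mults:
        total += k * (remaining - k)
        remaining -= k
    return total


def codim_by_count(mults):
    """Codimension from the parameter count: directions plus axis lengths."""
    n = sum(mults)
    dim_all = n * (n + 1) // 2
    dim_stratum = directions_dim(mults) + len(mults)
    return dim_all - dim_stratum


def codim_by_lemma(mults):
    """Codimension from the lemma: sum of (i - 1)(i + 2)/2 over the axes."""
    return sum((i - 1) * (i + 2) // 2 for i in mults)


if __name__ == "__main__":
    for mults in ([2, 1, 1], [3], [2, 2, 1], [4, 1], [1, 1, 1]):
        print(mults, codim_by_count(mults), codim_by_lemma(mults))
```

For every list of multiplicities the two functions should agree; a single double axis gives codimension 2 and a single triple axis gives codimension 5, whatever the dimension n.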


EXAMPLE. Consider the case n = 2, i.e., ellipses in the plane. An ellipse is
determined by three parameters (e.g., the lengths of the two axes and the 
angle giving the direction of one of them). Thus the manifold of ellipses in the 
plane is three-dimensional, as it must be by our formula. 

A circle, however, is determined by one parameter (the radius). Thus the 
manifold of circles in the space of ellipses is a line in a three-dimensional 
space, and not a surface as it would seem at first glance. 


This “paradox” becomes, perhaps, clearer from the following calculation. The quadratic
forms Ax² + 2Bxy + Cy² with equal eigenvalues form a submanifold of the three-dimensional
space with coordinates A, B, and C, given by one equation λ_1 − λ_2 = 0, where λ_1 and λ_2
are the eigenvalues, regarded as functions of A, B, and C. However, the square of the left-hand
side of this equation is a sum of two squares,
as is clear from the formula for the discriminant of the characteristic equation:


$$\Delta = (A + C)^2 - 4(AC - B^2) = (A - C)^2 + 4B^2.$$


Thus the single equation Δ = 0 determines a line in the three-dimensional space of quadratic
forms (A = C, B = 0), and not a surface.
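
The calculation with the discriminant is easy to try numerically. The sketch below is only an illustration (the function names are ours): it computes the two eigenvalues of the form Ax² + 2Bxy + Cy² from the discriminant and shows that the gap between them vanishes only when both A = C and B = 0 hold.

```python
import math


def eigenvalues(A, B, C):
    """Eigenvalues of the form A x^2 + 2 B x y + C y^2, smaller one first."""
    disc = (A - C) ** 2 + 4 * B ** 2  # equals (A + C)^2 - 4 (A C - B^2)
    root = math.sqrt(disc)
    return (A + C - root) / 2, (A + C + root) / 2


def gap(A, B, C):
    """Difference of the eigenvalues; it is the square root of the discriminant."""
    low, high = eigenvalues(A, B, C)
    return high - low


if __name__ == "__main__":
    print(gap(2.0, 0.0, 2.0))   # a circle: both conditions A = C and B = 0 hold
    print(gap(2.0, 0.5, 2.0))   # A = C alone is not enough
    print(gap(3.0, 0.0, 1.0))   # B = 0 alone is not enough
```

Since the gap is the square root of a sum of two squares, making it zero requires two independent conditions, which is the codimension 2 discussed above.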


A simple consequence of the fact that the manifold of ellipsoids of revolu- 
tion has codimension 2 is that this manifold does not divide the space of all 
ellipsoids (and the manifold of quadratic forms with a multiple spectrum does 
not divide the space of quadratic forms), as a line does not divide a three- 
dimensional space. Therefore, we can assert not only that in an ellipsoid in 
“general position” all the axes have different lengths, but also that any two
such ellipsoids can be connected by a smooth curve in the space of ellipsoids con- 
sisting entirely of ellipsoids with axes of different lengths. Furthermore, if two 
ellipsoids in general position are connected by a smooth curve in the space 
of ellipsoids which contains a point which is an ellipsoid of revolution, then 
by an arbitrarily small displacement of the curve we can remove it from the 
set of ellipsoids of revolution, so that on the new curve all the points will be 
ellipsoids without multiple axes. 

One consequence of what we have said is a simple proof of the theorem 
that characteristic frequencies increase when the rigidity of a system is 
increased. The derivative of a non-multiple eigenvalue of a quadratic form 
with respect to a parameter is determined by the derivative of the quadratic 
form in the corresponding characteristic direction. If the rigidity is increased, 
the potential energy increases in every direction, including the characteristic 
directions. Thus the characteristic frequencies also increase. Hence we
have proved the theorem on the growth of frequencies in the case when it
is possible to go from the original system to a more rigid system, avoiding
multiple spectra. The proof in the presence of a multiple spectrum is now
obtained by a passage to the limit, based on the fact that the interior of the 
path from the original system to the more rigid system can be removed by 
an arbitrarily small perturbation from the set of systems with multiple 
spectra. 
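
The statement about the derivative of a non-multiple eigenvalue can be checked in a few lines. The notation here is ours, not the text's: A(p) is the symmetric operator of the quadratic form in an orthonormal basis, λ(p) a simple eigenvalue, and ξ(p) a unit eigenvector, all assumed differentiable in the parameter p (which holds near a simple eigenvalue); a prime denotes d/dp.

```latex
\begin{aligned}
A\xi &= \lambda\xi, \qquad (\xi, \xi) = 1, \\
A'\xi + A\xi' &= \lambda'\xi + \lambda\xi', \\
(A'\xi, \xi) + (A\xi', \xi) &= \lambda' + \lambda(\xi', \xi), \\
(A\xi', \xi) = (\xi', A\xi) &= \lambda(\xi', \xi)
\quad\Longrightarrow\quad \lambda' = (A'\xi, \xi).
\end{aligned}
```

If the rigidity increases, the form A′ is nonnegative in every direction, so (A′ξ, ξ) ≥ 0 and the eigenvalue does not decrease.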

In summary, we can say that a typical one-parameter family of ellipsoids 
(or quadratic forms in euclidean space) does not contain ellipsoids of revolu- 
tion (quadratic forms with multiple spectra). Applying this to an ellipsoid 
of inertia we obtain the conclusion above about the necessity for two adjust- 
able masses. 

We turn now to two-parameter systems. It follows from our calculations 
that, in a typical two-parameter system, ellipsoids of revolution are en- 
countered only at isolated points of the parameter plane. 


Consider, for example, a convex surface in three-dimensional euclidean space. The second 
fundamental form of the surface determines an ellipse in the tangent space at every point. 
Therefore, we have a two-parameter family of ellipses (which can be translated to one plane 
by choosing a local coordinate system near a point on the surface). We come to the conclusion 
that, at every point of the surface except at certain isolated points, the ellipse has axes of different 
lengths. Therefore, on surfaces of general form, there are two orthogonal fields of directions (the 
major and minor axes of the ellipses) with isolated singular points. In differential geometry 
these directions are called the directions of principal curvature, and these singular points are 
called umbilical points. For example, on the surface of an ellipsoid there are four umbilical 
points: they lie on the ellipse containing the major and minor axes, and two of them are clearly 
visible in the picture of the geodesics on an ellipsoid (cf. Figure 207). 


In exactly the same way, in a typical three-parameter family, ellipsoids of 
revolution are encountered only on certain lines in the three-dimensional 
parameter space. For example, if at every point of three-dimensional eucli- 
dean space, we are given an ellipsoid (i.e. a symmetric two-index tensor), 
then the singularities of the fields of principal axes will be, in general, on 
certain lines (where two of the three fields of directions have discontinuities). 
These lines, like the umbilical points in the preceding example, are of several 
different types. Their classification (for typical fields of ellipsoids) can be 
obtained from the classification of singularities of lagrangian projections 
given in Appendix 12. 

In a typical four-parameter family, ellipsoids of revolution occur on two- 
dimensional surfaces in the space of parameters. These surfaces have no 
singularities other than transverse intersections at isolated points of the 
parameter space; these values of the parameters correspond to ellipsoids 
with two (different) pairs of equal axes. 

Triple axes appear first for five parameters, at isolated points of the param- 
eter space. The values of the parameters corresponding to ellipsoids with a 
double axis form a three-dimensional manifold in the five-dimensional
parameter space with two types of singularities: transversal intersections of 
two branches along some curve and conic singularities at isolated points (not 
lying on this curve), i.e., at points of the parameter space corresponding to 
ellipsoids with three equal axes. These conic singularities have the following 
structure: by intersecting the three-dimensional manifold of ellipsoids of 
revolution with a four-dimensional sphere of small radius with center at the 
singular point, we obtain two copies of the projective plane. The resulting em- 
beddings of the projective plane in the four-dimensional sphere are diffeo- 
morphic to the embedding given by the five spherical harmonics of degree two 
on the two-dimensional sphere (five linear combinations of the functions x_i x_j,
orthonormal in the space of functions on the sphere x_1² + x_2² + x_3² = 1,
orthogonal to the identity, give an even mapping of S² into S⁴ and, therefore,
an embedding RP² → S⁴).

It remains to describe the behavior of the eigenvalues of a quadratic form 
in a typical two-parameter family as the parameter approaches a singular 
point where the two eigenvalues coincide. A little calculation shows that the 
graph of the pair of eigenvalues we are considering has, over the plane of 
parameters near the singular point, the form of a two-sheeted cone, whose 
vertex corresponds to the singular point, and each of its nappes to one of the 
eigenvalues (Figure 243). 



Figure 243 Characteristic frequencies of one- and two-parameter families of oscil- 
lating systems of general form 


A typical one-dimensional subfamily of our two-dimensional family has 
the form of a curve in the plane of parameters which does not pass through 
any singular points. Every one-parameter family which contains a singular 
point can be removed from it by a small perturbation; the resulting one- 
parameter family will be a curve in the space of parameters passing near the 
singular point. The graph of the eigenvalues over a curve on the plane of 
parameters passing near a singular point consists of those points of the cone 
which project onto this curve. Therefore, this graph near the singular point is 
close to a hyperbola, resembling a pair of intersecting straight lines (a pair of 
straight lines would be obtained if our one-parameter family passed through 
the singular point). 
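
The cone and the hyperbola can be seen in the simplest model. The sketch below assumes a hypothetical two-parameter family of forms with matrix [[p, q], [q, −p]], chosen only for illustration; its eigenvalues are ±√(p² + q²), so their graph over the (p, q) plane is a two-sheeted cone with vertex at the singular point p = q = 0. A one-parameter subfamily is taken to be the line q = const.

```python
import math


def eigenvalue_pair(p, q):
    """Eigenvalues of the symmetric matrix [[p, q], [q, -p]]."""
    r = math.hypot(p, q)
    return -r, r


def gap_along_line(q, steps=201, p_max=1.0):
    """Smallest gap between the two eigenvalues on the line q = const.

    The parameter p runs over a uniform grid on [-p_max, p_max]; with an odd
    number of steps the grid contains p = 0, where the gap is smallest.
    """
    gaps = []
    for i in range(steps):
        p = -p_max + 2 * p_max * i / (steps - 1)
        low, high = eigenvalue_pair(p, q)
        gaps.append(high - low)
    return min(gaps)


if __name__ == "__main__":
    for q in (0.0, 0.01, 0.05, 0.2):
        print(q, gap_along_line(q))
```

On the line q = 0, which passes through the singular point, the eigenvalues ±p cross as a pair of straight lines. On any nearby line q = ε the graph is the hyperbola λ² − p² = ε², and the eigenvalues come no closer than 2|ε| before diverging again.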

This discussion of eigenvalues of two-parameter systems of quadratic
forms explains the strange behavior of characteristic frequencies when a 
single parameter is varied: in general (except for completely singular cases), 
when a single parameter is varied the characteristic frequencies can approach 
one another but cannot collide; after approaching, they must again go off in 
different directions. 


B Application to the study of oscillations of continuous media 


The general argument above has numerous applications in the study of the 
dependence on parameters of the characteristic frequencies of various 
mechanical systems with finitely many degrees of freedom; however, the most 
interesting applications may be to systems with infinitely many degrees of 
freedom, describing oscillations of continuous media. These applications are 
based on the fact that the codimensions of manifolds of ellipsoids with given 
multiplicities of axes are determined by these multiplicities and do not depend 
on the dimension of the space. 

For example, the codimension of the set of ellipsoids of revolution in the 
manifold of all ellipsoids is equal to two in a space of any dimension; therefore,
it is natural to assume that in the infinite-dimensional “manifold” of ellipsoids in
infinite-dimensional Hilbert space, the set of ellipsoids of revolution has
codimension 2 (and, in particular, the space of ellipsoids without multiple 
axes is connected). 

Of course, arguments of this kind need rigorous justification. We will not, 
however, occupy ourselves with this, but we will see what conclusions follow 
from the argument above if we apply it to the problem of oscillations in 
continuous media. 

The kinetic energy of a continuous medium filling a compact region D is 
expressed in terms of the deviation u of a point x from equilibrium by the 
formula 


$$T = \frac{1}{2} \int_D \dot{u}^2 \, dx.$$


For definiteness, we can take the medium to be a membrane (in this case the 
region D is two-dimensional, and the deviation u one-dimensional). The 
kinetic energy defines a euclidean structure on the configuration space of the 
problem (i.e., in the space of functions u). The potential energy is given by the
Dirichlet integral


$$U = \frac{1}{2} \int_D (\nabla u)^2 \, dx$$


(from the mathematical point of view these data constitute the definition of
the membrane).

The squares of the characteristic frequencies of the membrane are the 
eigenvalues of the quadratic form U on the configuration space, whose metric 
is defined using the kinetic energy. We assume that a typical membrane corresponds
to a typical quadratic form (this assumption means transversality of
the manifold of quadratic forms corresponding to different membranes to 
the manifold of forms with multiple eigenvalues). If we believe in this prop- 
erty of general position, we come to the following conclusions. 


1. For membranes in general position, all the characteristic frequencies are 
different. We can go from one membrane in general position to another 
by a continuous path consisting entirely of membranes with simple 
spectra. Furthermore, a typical path connecting any two membranes does 
not contain even one membrane with a multiple spectrum (except, 
possibly, the ends of the path). 

2. By varying two parameters of the membrane we can make two character- 
istic frequencies coincide; to obtain a triple frequency, we must have at 
our disposal five independent parameters; for a four-fold frequency we
need nine parameters, etc.

3. If, by starting from a membrane with a simple spectrum and continuously 
deforming it, we pass to another membrane with a simple spectrum along 
any path in general position, then as a result, the k-th largest characteristic 
frequency of the second membrane is always obtained independently of 
the path of deformation from the k-th largest characteristic frequency of 
the original membrane; continuations of characteristic functions, however, 
do generally depend on the path of deformation (i.e., by changing the path, 
the sign of the resulting characteristic function can be changed). 

In particular, if by starting from a membrane with a simple spectrum 
and deforming it we describe a closed path in the space of membranes and 
return to the original membrane, bypassing the set of membranes with 
multiple spectra (which has codimension 2), then the k-th characteristic 
frequency returns to its original value, while the k-th characteristic func- 
tion may change sign. [Editor’s note: Conclusions like this have been 
proven by K. Uhlenbeck (Amer. J. Math. 98 (1976), 1059-1078). ] 


C The effect of symmetries on the multiplicity of the spectrum

A multiple spectrum is the exception in systems of general form, but it 
is not removable under small perturbations in cases when the given system 
is symmetric and the deformations preserve the symmetry. 

Consider, for example, a system of three identical masses at the vertices 
of an equilateral triangle, connected to one another and to the center of the 
triangle by identical springs, and capable of moving in the plane of the 
triangle. The system has rotational symmetry of order 3. Therefore, there 
is a linear operator g acting on the configuration space (which has dimension 
6), whose third power is equal to 1 and which leaves invariant both the 
euclidean structure of the configuration space and the ellipsoid in the con- 
figuration space giving the potential energy. 

It follows that this ellipsoid must be an ellipsoid of revolution. If we let 
g be the indicated operator on the configuration space and ξ a vector on the
major axis of the ellipsoid, then the axis in the direction gξ is also a major
axis (since the rotation g takes the ellipsoid to itself).

There are two possibilities for the vector gξ: either gξ = ξ, or the vectors
ξ and gξ are linearly independent. In the second case, the plane spanned by
the vectors ξ and gξ consists entirely of major axes. Therefore, the eigenvalues
corresponding to these axes are at least double. The space spanned by the
three vectors ξ, gξ, and g²ξ is invariant under g. It is either two dimensional
(in which case g acts by a 120° rotation) or three dimensional (in which case
g acts by the same rotation around ξ + gξ + g²ξ as an axis). In the latter case,
we may choose the direction of this sum for one of the principal axes of the 
ellipsoid, with the two other principal axes in the three-dimensional space 
perpendicular to it. It is therefore possible to choose the principal axes for an 
ellipsoid which is invariant under an orthogonal transformation of order three 
(in a space of any number of variables), so that each axis is either fixed under 
the transformation or is rotated by 120° in an invariant plane spanned by it 
and another axis (orthogonal to it, as well as to all other axes) of the same 
length. In what follows, we shall assume that the axes of ellipsoids and the 
directions of the corresponding characteristic oscillations have been chosen 
in the manner just described. 

Our argument shows that characteristic oscillations of a system with 
third-order rotational symmetry can be of two types: those invariant under 
rotation by 120° (gξ = ξ) and those passing under such a rotation to independent
characteristic oscillations with the same frequency (gξ and ξ independent).
In the second case, there actually arise three forms of characteristic
oscillations with the same frequency (ξ, gξ, and g²ξ), but only two of them are
independent:


$$\xi + g\xi + g^2\xi = 0,$$


since the sum of three vectors of equal length on the plane forming angles of 
120° is equal to zero. 

The number of characteristic oscillations of our system is generally equal 
to 6. To find out how many of them are of the first (symmetric) and second 
(nonsymmetric) type, we can use the following argument. Consider the 
limiting case, when each of the masses oscillates independently from the 
others. In this case, we can choose an orthonormal basis of the configura- 
tion space consisting of six characteristic oscillations, two for each point, for 
which that point moves and the other two do not. We denote by ξ_i and
η_i the characteristic vectors corresponding to the i-th point with characteristic
frequencies a and b, respectively, and let x_i, y_i be coordinates in the
orthonormal basis ξ_i, η_i. Then the potential energy can be written in the
form


$$U = \tfrac{1}{2}(a^2 x_1^2 + b^2 y_1^2) + \tfrac{1}{2}(a^2 x_2^2 + b^2 y_2^2) + \tfrac{1}{2}(a^2 x_3^2 + b^2 y_3^2).$$

The symmetry operator g permutes the coordinate axes:


$$g\xi_1 = \xi_2, \quad g\xi_2 = \xi_3, \quad g\xi_3 = \xi_1,$$
$$g\eta_1 = \eta_2, \quad g\eta_2 = \eta_3, \quad g\eta_3 = \eta_1.$$


We can now represent our six-dimensional space as the orthogonal direct 
sum of two straight lines and two two-dimensional planes, invariant under 
the symmetry operator g. That is, the invariant lines are defined by the 
directions of the vectors 


$$\xi_1 + \xi_2 + \xi_3 \quad \text{and} \quad \eta_1 + \eta_2 + \eta_3,$$


and the invariant planes are their orthogonal complements in the spaces
spanned by the vectors ξ_i and η_i, respectively. The first straight line is the
direction of a symmetric characteristic oscillation with frequency a, and the 
second the direction of one with frequency b. In exactly the same way, every 
vector in the first plane is a direction of characteristic oscillation with fre- 
quency a which, under rotation by 120°, goes to an independent oscillation 
of the same frequency; for all vectors in the second plane, the oscillation is 
also not symmetric, with frequency b. 

Thus, in this degenerate case of three independent points, there are two 
independent characteristic oscillations of symmetric type, and four un- 
symmetric, of which the latter are divided into two pairs. In each pair the 
oscillations have the same eigenvalue and are obtained from one another by 
rotation of the plane of our points by 120°. 
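
The splitting into symmetric and nonsymmetric oscillations described above can be sketched in coordinates. The code assumes that a vector of the configuration space is stored as the list (x_1, y_1, x_2, y_2, x_3, y_3) of its coordinates in the basis ξ_i, η_i, and that g acts by the cyclic permutation of the axes given earlier; the function names are ours.

```python
def g(v):
    """Cyclic symmetry: sends xi_1 -> xi_2 -> xi_3 -> xi_1 (same for eta)."""
    x1, y1, x2, y2, x3, y3 = v
    # The coefficient of xi_1 moves to xi_2, that of xi_2 to xi_3, and so on.
    return [x3, y3, x1, y1, x2, y2]


def split(v):
    """Decompose v into a g-invariant part and a part with zero orbit sum."""
    gv, ggv = g(v), g(g(v))
    sym = [(a + b + c) / 3 for a, b, c in zip(v, gv, ggv)]
    asym = [a - s for a, s in zip(v, sym)]
    return sym, asym


if __name__ == "__main__":
    v = [1.0, 2.0, 4.0, -1.0, 7.0, 5.0]
    sym, asym = split(v)
    print("symmetric part:   ", sym)
    print("nonsymmetric part:", asym)
    orbit_sum = [a + b + c for a, b, c in zip(asym, g(asym), g(g(asym)))]
    print("orbit sum of nonsymmetric part:", orbit_sum)
```

The symmetric part lies in the plane spanned by ξ_1 + ξ_2 + ξ_3 and η_1 + η_2 + η_3 and is fixed by g. The nonsymmetric part, together with its two images under g, sums to zero, so only two of the three oscillations obtained by rotation are independent.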

We now claim that the conclusion above holds true for any law of inter- 
action between our points if the interaction is symmetric, i.e., if the potential 
energy of the system is preserved under rotation of the plane by 120°. 

In fact, decompose the 6-dimensional configuration space into an ortho- 
gonal sum of the plane of invariant vectors of g and of its orthogonal comple- 
ment. The potential energy will decompose into a sum of two quadratic 
forms, one in two variables, the other in four. Now consider characteristic
oscillations in the two-dimensional and four-dimensional configuration 
spaces, with potential energy described above. The four-dimensional space 
decomposes into two g-invariant planes, orthogonal in the potential energy 
metric. We have obtained a system of six characteristic oscillations having 
the required properties. 

Thus, in a system in general form of three points in the plane with rotational 
symmetry of order 3, there are four different characteristic frequencies, two 
of which are simple and two double. Each of the simple characteristic fre- 
quencies corresponds to a symmetric characteristic oscillation, and each of 
the double ones to three characteristic oscillations obtained from one another 
by rotation by 120° and summing to zero (so that only two of them are 
independent). 
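
The conclusion can be checked on the system of three masses and springs described at the beginning of this section. The following is a rough numerical sketch under assumptions that go beyond the text: unit masses, a triangle inscribed in the unit circle, a fixed center, springs without pretension, and the potential energy linearized at equilibrium. The stiffness values and all function names are hypothetical, and the eigenvalues are found by a plain cyclic Jacobi iteration so that only the standard library is needed.

```python
import math


def stiffness_matrix(k_center, k_side):
    """6x6 matrix K of the potential energy U = (1/2) u^T K u.

    Coordinates: u = (u1x, u1y, u2x, u2y, u3x, u3y), the displacements of
    three unit masses from the vertices of an equilateral triangle.
    """
    verts = [(math.cos(2 * math.pi * i / 3), math.sin(2 * math.pi * i / 3))
             for i in range(3)]
    K = [[0.0] * 6 for _ in range(6)]

    def add_spring(stiffness, d):
        # d is the 6-vector for which the linearized elongation equals d . u
        for r in range(6):
            for c in range(6):
                K[r][c] += stiffness * d[r] * d[c]

    for i, (x, y) in enumerate(verts):
        d = [0.0] * 6
        d[2 * i], d[2 * i + 1] = x, y  # spring from vertex i to the fixed center
        add_spring(k_center, d)
    for i in range(3):
        j = (i + 1) % 3
        ex, ey = verts[i][0] - verts[j][0], verts[i][1] - verts[j][1]
        length = math.hypot(ex, ey)
        ex, ey = ex / length, ey / length
        d = [0.0] * 6
        d[2 * i], d[2 * i + 1] = ex, ey
        d[2 * j], d[2 * j + 1] = -ex, -ey
        add_spring(k_side, d)
    return K


def jacobi_eigenvalues(matrix, sweeps=100, tol=1e-24):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def multiplicities(values, tol=1e-8):
    """Group sorted values that agree within tol; return [value, count] pairs."""
    groups = []
    for v in sorted(values):
        if groups and abs(v - groups[-1][0]) < tol:
            groups[-1][1] += 1
        else:
            groups.append([v, 1])
    return groups


if __name__ == "__main__":
    for value, count in multiplicities(jacobi_eigenvalues(stiffness_matrix(1.0, 1.0))):
        print(f"squared frequency {value:.6f}  multiplicity {count}")
```

In this model the squared frequencies fall into four groups, two simple and two double. One of the simple ones is zero: a small rigid rotation about the center stretches no spring, and this oscillation is symmetric in the sense used above.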

PROBLEM. Classify the characteristic oscillations of a system with the symmetries of an equilateral
triangle (allowing not only rotation by 120°, but also reflection through the altitude of the 
triangle). 


PROBLEM. Classify the characteristic oscillations of a system whose group of symmetries is the 
group of 24 rotations of the cube. 


ANSWER. The oscillations will be of five types. By rotations, from each oscillation one can obtain 
systems of 8, or 6, or 4, or 2, or 1 independent oscillations (in the last case the oscillations are 
entirely symmetric). 


Remark. To classify oscillations in systems with any group of symmetries, a special apparatus 
has been developed (the so-called theory of group representations). Cf., for example, Michael 
Tinkham, Group Theory and Quantum Mechanics, McGraw-Hill, 1964. 


D The behavior of frequencies of a symmetric system under a 
variation of parameters preserving the symmetry 


We assume now that our symmetric system depends in a general way on some 
number of parameters, and that the symmetry is not disturbed when the 
parameters are varied. Then the characteristic frequencies of various multi- 
plicities will also depend on the parameters, and the question arises of when 
the characteristic frequencies will collide. We will confine ourselves to 
formulating a result for the simplest case of systems with third-order rota- 
tional symmetry (for rotational symmetry of any order n > 3, the answer is 
the same). The details can be found in the following articles: V. I. Arnold, 
Modes and quasi-modes, Functional Analysis and Its Applications, 6:2 
(1972), 94-101; V. N. Karpushkin, The asymptotic behavior of the eigen- 
values of symmetric manifolds and the “most probable” representations of 
finite groups, Moscow Univ. Math. Bull. 29 (1974), no. 2, 136-139. 

Characteristic oscillations of any system with rotational symmetry of 
order 3 are divided into two types: symmetric oscillations, and oscillations 
carried by rotation by 120° into independent ones. For a general system with 
third-order rotational symmetry (without, in particular, any additional 
symmetry) all the characteristic frequencies of the first type are simple, and 
of the second, double. In addition, it turns out that if a system depends in a 
general way on one parameter and is symmetric for all values of the param- 
eter, then under variation of the parameter, the characteristic frequencies of 
symmetric oscillations do not collide with one another, and the double 
characteristic frequencies of asymmetric oscillations do not split. In addition, 
the double characteristic frequencies of asymmetric oscillations do not 
collide with one another under a change of parameters. However, the char- 
acteristic frequencies of symmetric and asymmetric oscillations move under 
changes of parameter independently from one another, so that for discrete 
values of the parameter the characteristic frequency of a symmetric oscilla- 
tion and the (double) characteristic frequency of an asymmetric oscillation 
can collide (and pass through one another). 

In order to make two characteristic frequencies of symmetric oscillations 
collide, we must vary at least two parameters; and to make two characteristic 
frequencies of asymmetric oscillations collide we must vary at least three. 

In general, in the typical family of systems with third-order rotational 
symmetry, for the collision of i simple characteristic frequencies (i symmetric 
oscillations) and j double frequencies (j unsymmetric oscillations) to occur, 
the number of parameters of the family must be at least 


$$\frac{(i - 1)(i + 2)}{2} + j^2.$$
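The three special cases stated above (two simple frequencies, two double frequencies, one simple and one double) can be checked with a few lines of Python. This is only a sketch: the function name `min_parameters` is ours, and it assumes that the bound reads (i − 1)(i + 2)/2 + j², which is the reading consistent with those three cases.

```python
def min_parameters(i: int, j: int) -> int:
    """Assumed bound (i - 1)(i + 2)/2 + j**2 for the collision of
    i simple and j double characteristic frequencies."""
    if i < 0 or j < 0 or i + j == 0:
        raise ValueError("need i, j >= 0 and at least one frequency")
    # (i - 1)(i + 2) = i*i + i - 2 is always even, so the division is exact.
    return (i - 1) * (i + 2) // 2 + j * j


# The three cases discussed in the text.
for i, j in [(2, 0), (0, 2), (1, 1)]:
    print(f"i={i}, j={j}: at least {min_parameters(i, j)} parameter(s)")
```

The case i = 1, j = 1 gives one parameter, in agreement with the statement that a simple and a double frequency can pass through one another at discrete values of a single parameter.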


We apply this to oscillations of symmetric membranes. Here we will 
assume that the membrane is of general form, admits rotation by 120°, and 
corresponds to an ellipsoid of general form in the space of ellipsoids of the 
configuration space admitting the transformation of the configuration space 
induced by the rotation of the membrane. 


The exact formulation of this assumption is that, for all membranes except a set of infinite 
codimension, the mapping from the space of symmetric membranes into the space of symmetric 
ellipsoids is transverse to each of the manifolds of ellipsoids with a given number of multiple 
axes. 


If we agree to this assumption, we come to the following conclusions about 
oscillations of symmetric membranes. 


1. For membranes of general form admitting rotation by 120°, asymp- 
totically one-third of the characteristic frequencies (counting them with 
multiplicities) are simple, and the corresponding characteristic oscilla- 
tions admit rotation by 120°. The remaining characteristic frequencies are 
double; each double characteristic frequency corresponds to three eigen- 
functions whose sum is zero and which are taken to one another under 
rotation by 120°. 

2. In general one-parameter families of such symmetric membranes, 
for isolated values of the parameters there are collisions of a single fre- 
quency with a double frequency, but there are no collisions of single 
frequencies with one another or collisions of double frequencies with one 
another. 

3. The minimal number of parameters of a family of membranes for which 
more complicated collisions of characteristic frequencies are realized 
(stably with respect to small perturbations preserving the symmetry) is 
given by the formula 


$$\sum_{i,j} \left[ \frac{(i - 1)(i + 2)}{2} + j^2 \right] \nu_{ij},$$


where $\nu_{ij}$ is the number of points of collision of i single and j double 
frequencies. 

In particular, for a typical small deformation of a circular membrane 
preserving rotational symmetry of order 3, a third of the eigenvalues 
(corresponding to eigenfunctions with azimuthal part cos 3kφ and 
sin 3kφ) immediately split apart. Under further one-parameter 
deformation the simple and double characteristic frequencies can pass through 
one another, but two simple or two double frequencies cannot collide with one 
another. 
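The role of the azimuthal number can be illustrated numerically. The sketch below (helper names are ours) takes the azimuthal factor cos mφ of an eigenfunction of the circular membrane and checks two things: for m divisible by 3 the factor is unchanged by rotation through 120°, while for every other m the three rotated copies sum to zero, as described for the double frequencies.

```python
import math

STEP = 2.0 * math.pi / 3.0  # rotation by 120 degrees


def rotated_sum(m: int, phi: float) -> float:
    """Sum of cos(m*phi) over the rotations by 0, 120 and 240 degrees."""
    return sum(math.cos(m * (phi - r * STEP)) for r in range(3))


def is_invariant(m: int, samples: int = 50, tol: float = 1e-9) -> bool:
    """True if cos(m*phi) is unchanged when phi is rotated by 120 degrees."""
    angles = (0.1 + 2.0 * math.pi * n / samples for n in range(samples))
    return all(
        abs(math.cos(m * (phi - STEP)) - math.cos(m * phi)) < tol
        for phi in angles
    )


for m in range(1, 7):
    worst = max(abs(rotated_sum(m, 0.1 + 0.37 * n)) for n in range(20))
    print(m, "invariant" if is_invariant(m) else "not invariant",
          "max |sum of rotated copies| =", round(worst, 9))
```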


E Discussion 


The value of the concepts of general position and symmetry lies, in particular, 
in the fact that they allow us to obtain some information in those cases 
where we cannot find an exact solution of a problem. In particular, for 
almost no membranes do we know the forms of the characteristic oscillations. 
Nevertheless, from general arguments we can say something, for example, 
about the multiplicities of eigenvalues. 

The study of high-frequency oscillations of continuous media is very 
important in many fields (optics, acoustics, etc.), and special methods have 
been developed for approximate determination of the form of character- 
istic oscillations. One of these methods (called the method of quasi-classical 
asymptotics) consists of seeking an oscillation which is locally close to a 
simple harmonic wave of short length, but which changes its amplitude and 
the direction of its front from point to point. 

Analysis (which we will not go into here) shows that in some cases we 
can construct approximate solutions, with the indicated properties, of the 
equation for eigenfunctions. They are approximate solutions in the sense 
that they almost satisfy the equation for eigenfunctions (not in the sense that 
they are close to real eigenfunctions). 

In particular, if the membrane has the form of an equilateral triangle with 
smoothed and strongly blunted corners, then we can construct an approxi- 
mate solution of the type described which differs appreciably from zero only 
in a neighborhood of one of the altitudes of the triangle. (Physicists call this 
approximate solution the wave analogue of a beam moving along the altitude 
of the triangle; this beam is a stable¹¹⁵ trajectory on a billiard table having 
the shape of our membrane; cf. the following appendix on short wave 
asymptotics). 

¹¹⁵ The condition for linear stability of a billiard trajectory has the form 
$(r_1 + r_2 - l)(r_1 - l)(r_2 - l) > 0$, where $l$ is the length of the 
interval of the trajectory and $r_1$ and $r_2$ are the radii of curvature of 
the walls at its ends. 

It follows from symmetry and general position arguments that typical 
membranes with rotational symmetry of third order have no real 
characteristic oscillations of the type described. Assume that one of the 
characteristic oscillations of the membrane is concentrated near an altitude 
(but not near the center of the membrane). Then, rotating it by 120° and 240° 
we obtain three characteristic oscillations with the same characteristic 
frequency. These three oscillations are independent (this follows from the 
fact that their sum is not zero). Therefore, the characteristic frequency has 
multiplicity 3, which does not occur in typical systems with third-order 
rotational symmetry. 

From this argument it is clear that attempting to construct rigorous high- 
frequency asymptotics for eigenfunctions is a rather hopeless task; what we 
can hope to do is to obtain approximate formulas for almost characteristic 
oscillations. Such an almost characteristic oscillation can differ very strongly 
from real characteristic oscillations, but if we give the membrane the initial 
condition corresponding to it, then for a long time the oscillation will resemble 
a standing wave (characteristic oscillation). 

An example of an almost characteristic oscillation is the motion of one 
of two identical pendulums connected by a very weak spring. If, at the initial 
moment, we set the first pendulum in motion and leave the second fixed, then 
for a long time it will appear that only the first pendulum is oscillating, and 
the oscillation will be almost characteristic. For true characteristic oscilla- 
tions, both pendulums oscillate with the same amplitude. 
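
The pendulum example can be made quantitative. The sketch below uses assumptions of our own: small oscillations, unit masses, natural frequency `omega` and a spring of stiffness `k`. The two true characteristic oscillations are then the in-phase motion (frequency omega) and the out-of-phase motion (frequency sqrt(omega² + 2k)), and the motion started from "first pendulum displaced, second at rest" is their half-sum.

```python
import math


def pendulums(t, omega=1.0, k=1e-3):
    """Displacements (x1, x2) of two identical linearized pendulums joined by
    a weak spring, started at rest from x1 = 1, x2 = 0."""
    w_sym = omega                             # in-phase mode: spring unstretched
    w_anti = math.sqrt(omega**2 + 2.0 * k)    # out-of-phase mode
    x1 = 0.5 * (math.cos(w_sym * t) + math.cos(w_anti * t))
    x2 = 0.5 * (math.cos(w_sym * t) - math.cos(w_anti * t))
    return x1, x2


def max_second_amplitude(t_end, steps=20000, **kw):
    """Largest |x2| observed on the time interval [0, t_end]."""
    return max(abs(pendulums(t_end * n / steps, **kw)[1])
               for n in range(steps + 1))


ten_periods = 20.0 * math.pi
print("max |x2| during the first ten periods:", max_second_amplitude(ten_periods))
print("max |x2| up to t = 3200:", max_second_amplitude(3200.0))
```

With these values the second pendulum stays almost at rest for many periods, and the energy passes to it only after a time of order π divided by the difference of the two characteristic frequencies.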


The problem of connecting the geometry of a membrane with the properties of its character- 
istic oscillations has been intensively studied in recent years by many authors (including H. Weyl, 
S. Minakshisundaram and A. Pleijel, A. Selberg, J. Milnor, M. Kac, I. Singer, H. McKean, 
M. Berger, Y. Colin de Verdière, J. Chazarain, J. J. Duistermaat, V. F. Lazutkin, A. I. Shnirel’man, 
and S. A. Molchanov). 

To the simplest question, “Can you hear the shape of a drum?” the answer turns out to be 
negative: there exist non-isometric riemannian manifolds with the same spectrum. On the 
other hand, several properties of a manifold can be recovered from the eigenvalues of the laplacian 
and from the properties of eigenfunctions (for example, the complete set of lengths of closed 
geodesics can be recovered). 

From the point of view of physical optics, the description of the propagation 
of light in geometric optics, using rays (i.e., Hamilton’s canonical equations) 
or wave fronts (i.e., the Hamilton-Jacobi equation), is only an approximation. 
According to the ideas of physical optics, light is electromagnetic waves, 
and geometric optics is a first approximation, a good description of 
phenomena only when the length of the waves is small compared to the size 
of the objects being considered. 

A mathematical version of these physical ideas consists of asymptotic 
formulas for solving the corresponding differential equations—formulas 
which give better approximations for higher-frequency oscillations (i.e., for 
shorter waves). These asymptotic formulas can be written in terms of rays 
(i.e., motions in some hamiltonian dynamical system) or fronts (i.e., solutions 
of the Hamilton-Jacobi equation). 

Similar short wave asymptotics exist for solutions of many equations in 
mathematical physics, describing all wave processes. In different areas of 
physics and mathematics they are connected with different names. For 
example, in quantum mechanics, short wave asymptotics are called 
quasi-classical approximations; they are determined by the so-called WKBJ 
method (Wentzel, Kramers, Brillouin, Jeffreys), although these approximations 
were used much earlier by Liouville, Green, Stokes, Rayleigh and others. 

The construction of short wave asymptotics is based on the idea that, 
locally, a series of almost strictly sinusoidal waves is observed at each place, 
although the amplitudes of these waves and the directions of their fronts 
change slowly from point to point. Formal substitution of a function of this 
form into the partial differential equations describing the wave process 
leads us (in a first approximation for waves of small length) to the 
Hamilton-Jacobi equation for wave fronts. The higher-order approximations 
allow us to determine as well the dependence of the amplitude of oscillation 
on the point. 

Of course, the entire procedure requires a mathematical foundation. The 
exact formulation and proof of the corresponding theorems are not at all easy. 
Particular difficulty is introduced by “caustics” (i.e., focal or conjugate 
points, or turning points). 

Caustics are envelopes of families of rays; they can be seen on a wall 
illuminated by rays reflected from some smooth curved surface. If the rays 
orthogonal to the wave fronts intersect and form caustics, then near the 
caustics the formulas for short wave asymptotics must be slightly changed. 
Namely, the phase of oscillations along each ray undergoes a standard dis- 
continuity (one-fourth of a wave) upon each passage of the ray through a 
caustic. 

A precise description of all these phenomena may be conveniently devel- 
oped in terms of the geometry of lagrangian submanifolds of the correspond- 
ing phase space and their projections onto the configuration space. Here, 
caustics are interpreted as singularities of the projection, from phase space 
to configuration space, of that lagrangian manifold which represents a 
family of rays. Thus, the normal forms of singularities of lagrangian 
projections introduced in Appendix 12 supply a classification of singularities of 
caustics formed by systems of rays in “general position.” 

In this appendix we introduce (without proof) the simplest formulas of 
short wave asymptotics for the Schrödinger equation of quantum mechanics. 
A more detailed exposition can be found in the following places: 


J. Heading, Introduction to phase integral methods, Methuen Co. Ltd., 1962. (Cf. especially 
Appendix II (by V. P. Maslov) in the Russian translation of Heading’s book, Moscow 1965). 


V. P. Maslov, Théorie des perturbations et méthodes asymptotiques, Paris, Dunod, 1972 (Russian 
edition: Moscow University, 1965). 


V. I. Arnold, On a characteristic class entering into conditions of quantization, Functional 
Analysis and its Applications, v. 1 (1967). 


L. Hörmander, Fourier integral operators, Acta Math. 127 (1971), 79-183. 


A Quasi-classical approximation for solutions 
of Schrödinger’s equation 


Schrödinger’s equation for a particle in a field with potential energy U in 
euclidean space is an equation for a complex-valued function ψ(q, t): 


$$i h \frac{\partial \psi}{\partial t} = -\frac{h^2}{2} \Delta \psi + U(q) \psi, \qquad q \in \mathbf{R}^n, \quad t \in \mathbf{R}.$$


Here, h is some real constant which is also a small parameter of the problem 
being considered, and Δ is the Laplace operator. 
We assume that the initial condition has the short wave form 


$$\psi|_{t=0} = \varphi(q) e^{(i/h) s(q)},$$


where the smooth function φ is nonzero only inside some bounded region. 
We will find below an asymptotic (as h → 0) formula for the solution of 
Schrödinger’s equation with such an initial condition. 

First of all, we consider the motion of a classical particle in the field with 
potential energy U, i.e., we consider Hamilton’s equations 


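$$\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}, \qquad \text{where } H = \tfrac{1}{2} p^2 + U(q)$$


in 2n-dimensional phase space. The solutions of these equations determine 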
a phase flow (under some conditions on the potential, which we assume ful- 
filled; these conditions prevent the particle from going off to infinity in a 
finite time). 

We associate to our short wave initial condition a lagrangian submanifold 
of the phase space (i.e., a manifold whose dimension is equal to the dimension 
of the configuration space and on which the 2-form dp ∧ dq defining the 
symplectic structure on the phase space is identically zero). Namely, we define 
the “momentum” corresponding to our initial condition as the gradient of 
the phase, i.e., we set 


$$p(q) = \frac{\partial s}{\partial q}.$$


Lemma. For any smooth function s, the graph of the function p(q) constructed 
by it in the phase space $\mathbf{R}^{2n} = \{(p, q)\}$ is a lagrangian manifold. Conversely, 
if a lagrangian manifold projects diffeomorphically onto the q-space (i.e., it 
is a graph), then it is given by some generating function s, according to the 
formula above. 
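
The lemma is stated without proof; the following short computation is our own sketch of the standard argument, with $M$ denoting the graph of $p(q) = \partial s/\partial q$. The first line uses only the symmetry of the second derivatives of $s$ against the antisymmetry of $dq_j \wedge dq_i$. For the converse, vanishing of $dp \wedge dq$ on a graph means that the 1-form $p(q)\,dq$ is closed on the q-space, hence it is the differential of some function $s$.

```latex
\begin{aligned}
dp \wedge dq \,\big|_{M}
  &= \sum_{i} dp_i \wedge dq_i
   = \sum_{i,j} \frac{\partial^2 s}{\partial q_i \, \partial q_j}\,
     dq_j \wedge dq_i = 0, \\[4pt]
d\bigl(p(q)\,dq\bigr) &= dp \wedge dq \,\big|_{M} = 0
  \quad\Longrightarrow\quad
  p(q)\,dq = ds, \qquad p = \frac{\partial s}{\partial q}.
\end{aligned}
```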


We denote the lagrangian manifold constructed from the initial condition 
(with the function s) by M. After time t the phase flow $g^t$ carries the manifold 
M to another manifold $g^t M$. This new manifold is also lagrangian, since the 
phase flow preserves the symplectic structure. 

For small t, the new lagrangian manifold, like the old, projects diffeo- 
morphically onto the configuration space. However, for large t this is not 
necessarily true (Figure 244). 


Figure 244 Transformation of lagrangian manifolds by the phase flow 


In other words, several points of the new lagrangian manifold may project 
to one point Q of the configuration space. We assume that there are only 
finitely many of these points and that they are all nondegenerate (i.e., that at 
each of the points of the new lagrangian manifold which project onto Q, the 
derivative of the projection mapping onto the configuration space is non- 
degenerate). 
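
A minimal sketch of how several points can come to project onto one point Q, under two assumptions that are ours rather than the text's: a free particle (U = 0, so the phase flow is (p, q) → (p, q + tp)) and the initial phase s(q) = cos q, so that p(q) = −sin q. The image of the initial manifold at time t is then the curve Q = q − t sin q, which is a graph over the Q-axis for |t| < 1 and folds over for |t| > 1.

```python
import math


def count_preimages(Q, t, n=20000):
    """Number of initial points q with q + t*p(q) = Q, where p(q) = -sin(q).

    Roots are counted by sign changes on a uniform grid, so a root that falls
    exactly on a grid point, or a tangency (Q on the caustic), is not handled.
    """
    def f(q):
        return q - t * math.sin(q) - Q

    # Every root satisfies |q - Q| = |t sin q| <= |t|.
    lo, hi = Q - abs(t) - 1.0, Q + abs(t) + 1.0
    count = 0
    prev = f(lo)
    for k in range(1, n + 1):
        cur = f(lo + (hi - lo) * k / n)
        if prev * cur < 0:      # one transversal root inside this cell
            count += 1
        prev = cur
    return count


print("t = 0.5:", count_preimages(0.1, 0.5), "trajectory arrives at Q = 0.1")
print("t = 2.0:", count_preimages(0.1, 2.0), "trajectories arrive at Q = 0.1")
```

In this model the derivative of the projection is 1 − t cos q, so the nondegeneracy condition fails exactly at the points where t cos q = 1.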


The nondegeneracy condition is satisfied for almost all points Q. Those exceptional points Q 
for which it is not satisfied form a set of measure zero in the configuration space. In the general 
case, this set is a surface whose dimension is one less than the dimension of the configuration 
space. This surface, playing the role of a caustic in our problem, can itself have complicated 
singularities. 

The points of the new lagrangian manifold projecting to the point Q arose 
under the phase flow transformation from several points of the original 
lagrangian manifold (constructed from the initial condition). In other words, 
after time t, several trajectories of classical particles, with initial conditions 
belonging to the original lagrangian manifold, arrive at Q. 

We let $(p_j, q_j)$ denote these initial points in the phase space, and $S_j$ the 
action along the trajectories of the phase flow coming from the point $(p_j, q_j)$. 
More precisely, we set 


$$S_j(Q, t) = s(q_j) + \int_0^t L \, d\theta,$$


where $L = \dot q^2/2 - U(q)$ and $g^\theta(p_j, q_j) = (p(\theta), q(\theta))$. 


Then, as h → 0, the solution of Schrödinger’s equation with the oscillating 
initial condition given by the functions s and φ has asymptotic form 


$$\psi(Q, t) = \sum_j \varphi(q_j) \left| \frac{DQ}{Dq_j} \right|^{-1/2} e^{(i/h) S_j(Q, t) - (i\pi/2) \mu_j} + O(h),$$


where $\mu_j$ is an integer (the Morse index) which will be defined below. 

In order to explain this formula, we first consider the case when the time 
interval t is small. In this case, the sum is reduced to a single term, since the 
lagrangian manifold obtained from the original lagrangian manifold by the 
phase flow transformation after small time projects diffeomorphically onto 
the configuration space. In other words, of the family of particles 
corresponding to the initial condition for Schrödinger’s equation, only one 
arrives at Q after the small time t. 

For small t, the Morse index is equal to zero (as we will see below from its 
definition). In this way the function ψ(Q, t) has, like the initial condition, a 
rapidly oscillating form. Thus, the function S defining the wave fronts at time 
t is none other than the value at time t of the solution of the Hamilton-Jacobi 
equation, the initial condition for which is given by the function s defining 
the wave front at the initial moment. The amplitude of the wave at time t at 
the point Q is obtained from the amplitudes, at the initial moment at the 
original point, of the trajectories coming to Q multiplied by a certain factor. 
This factor is chosen so that, under motions of the particles corresponding 
to our initial conditions, the integral of the square of the modulus of the 
function ψ, over a region of configuration space filled with particles, does not 
change with time. (Here we assume that at the initial moment, some region in 
the configuration space has been selected; then the phase points on the 
original lagrangian manifold are selected whose projections onto the con- 
figuration space lie in this region; their images under the action of the phase 
flow after time t are found; finally, the projections of these images onto the 
configuration space form the region “filled with particles at time t.”) 


B The Morse and Maslov indices 


The number $\mu_j$ is defined as the number of focal points to the manifold M 
on the interval [0, t] of the phase curve starting out at the point $(p_j, q_j)$. 

Focal points to the manifold M are defined as follows. We chose the point 
Q so that, under projection of the lagrangian manifold obtained from M at 
time t, a nondegeneracy condition is satisfied at this point. However, if we 
consider the entire phase curve coming from the point $(p_j, q_j)$, then at some 
moments of time θ between 0 and t, the nondegeneracy condition may not be 
satisfied at the point (p(θ), q(θ)) of the lagrangian manifold $g^\theta M$. Such points 
are called focal points to the manifold M along this phase curve. 


We note that the definitions of focal points to M and the Morse index do not depend on 
Schrödinger’s equation, but relate simply to the geometry of the phase flow in the cotangent 
bundle to the configuration space (or to the calculus of variations, which is the same thing). 

In particular, as our lagrangian manifold M we may take the fiber of the cotangent bundle 
passing through the point $(p_0, q_0)$ (given by the condition $q = q_0$). In this case a focal point to 
M on the phase curve going out from $(p_0, q_0)$ is called conjugate to the original point (more 
precisely, the projection of this focal point onto the configuration space is said to be conjugate 
to the point $q_0$ along the extremal in the configuration space starting at $q_0$ with momentum $p_0$). 
In the even more special case of motion along a geodesic on a riemannian manifold, a focal 
point to a fiber of the cotangent bundle is called conjugate to the initial point of the geodesic 
along this geodesic. For example, the south pole of a sphere is conjugate to the north pole along 
any meridian. 

The Morse index of an interval of a geodesic, equal to the number of points conjugate to the 
initial point, plays an important role in the calculus of variations. Namely, we consider the 
second differential of the action as a quadratic form on the space of variations (with fixed end- 
points) of the geodesic we are studying. Then the index of inertia of this quadratic form is equal 
to the Morse index (cf., for instance, J. Milnor, Morse Theory, Princeton University Press, 1967). 

Thus the geodesic, up to the first conjugate point, is a minimum of the action, which justifies 
the name “principle of least action” for various variational principles of mechanics. 


We note that in calculating the Morse index, the focal points must be 
counted with multiplicity (the multiplicity of a focal point in general position 
is equal to 1). 

The Morse index is a particular case of the so-called Maslov index, which 
is defined independently of the phase flow for any curve on a lagrangian mani- 
fold of the cotangent bundle over the configuration space. 

Consider the projection of our n-dimensional lagrangian manifold onto 
the n-dimensional configuration space. This is a smooth mapping of 
manifolds of the same dimension. It can have singular points, i.e., points at which 
the rank of the derivative mapping drops, and in a neighborhood of which 
the projection is not a diffeomorphism. 

It turns out that in general the set of singular points has dimension n — 1 
and consists of the union of a smooth manifold of dimension n — 1 made up of 
simple singular points at which the rank drops by 1, and a finite set of 
manifolds whose dimensions are n − 3 and smaller. Here, “in general” means that 
these properties can be attained by an arbitrarily small perturbation of the 
lagrangian manifold, under which it remains lagrangian. 


We should point out that, among the pieces of various ranks into which the set of singular 
points is divided, there is no piece of dimension n — 2. After the simplest singular points, forming 
a manifold of dimension n — 1, there are the points where the rank drops by two; they form a 
manifold of dimension n − 3. The projection of the set of singular points onto the configuration 
space (the caustic) consists, in general, of pieces of all dimensions from 0 to n − 1 without 
omissions. 


Furthermore, it turns out that the (n — 1)-dimensional manifold of the 
simplest singular points is two-sided in the lagrangian manifold; that is, we 
can coordinate the orientations of the normals at all points in the following 
way. 

Consider some simple singular point on the lagrangian manifold. We take 
a system of coordinates $q_1, \ldots, q_n$ in a neighborhood of the projection of this 
point onto the configuration space. Let $p_1, \ldots, p_n$ be corresponding 
coordinates in the fiber of the cotangent bundle. In a neighborhood of our singular 
point, we can consider the lagrangian manifold as the graph of the vector 
function $(q_1, p_2, \ldots, p_n)$ of the variables $(p_1, q_2, \ldots, q_n)$ (or a vector function 
of an analogous form in which the role of the distinguished coordinate is 
played not by the first coordinate but by any of the remaining coordinates). 

Singular points near the given one are then defined by the condition 
$\partial q_1/\partial p_1 = 0$. For lagrangian manifolds in general position, this derivative 
changes sign upon passing from one side of the manifold of singular points to 
the other in our neighborhood of the simple singular point. We will call the 
side where this derivative is positive the positive side. 


We note that it is necessary to prove that the definitions of positive direction near different 
points agree with one another. Furthermore, it must be shown that the positive direction near 
one point is well defined, i.e., does not depend on the coordinate system. All this can be done by 
direct calculations (cf. the article cited above in “Functional Analysis”). For further development 
of these ideas, see V. I. Arnold, Sturm theorems and symplectic geometry, Funct. Anal. Appl. 
19 (1985). 


Now the Maslov index of an oriented curve on a lagrangian manifold is 
defined as the number of passages from the negative side of the manifold of 
singularities to the positive side, minus the number of passages in the other 
direction. In this we assume that the ends of the curve are nonsingular and 
that the curve intersects only the manifold of simple singular points and only 
with nonzero angles. Having defined the index for such curves, we can define 
it for an arbitrary curve connecting two nonsingular points: to do this it is 
sufficient to approximate the curve by one which intersects only the manifold 
of simple singular points and only with nonzero angles. It can be shown that 
the index does not depend on the choice of the approximating curve. 


PROBLEM. Find the index of the circle p = cos t, q = sin t oriented by the parameter t, 
0 ≤ t ≤ 2π, in the lagrangian manifold $p^2 + q^2 = 1$ of the phase plane. 


ANSWER. +2. 
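
The answer can be checked directly from the definition by counting signed crossings of the singular set. The sketch below is an illustration with names of our own choosing. Along the circle, q is locally a function of p with dq/dp = (dq/dt)/(dp/dt); the projection onto the q-axis is singular where dq/dt = 0, and each passage from the side dq/dp < 0 to the side dq/dp > 0 contributes +1.

```python
import math


def maslov_index_circle(n=3600):
    """Signed number of crossings of the singular set dq/dp = 0 along the
    circle p = cos t, q = sin t, traversed once in the direction of growing t."""
    def dq_dp(t):
        # dq/dp = (dq/dt) / (dp/dt) along the curve
        return math.cos(t) / (-math.sin(t))

    index = 0
    # Grid shifted by a quarter step so that no sample hits a singular point.
    ts = [0.5 * math.pi / n + 2.0 * math.pi * k / n for k in range(n + 1)]
    for a, b in zip(ts, ts[1:]):
        if math.cos(a) * math.cos(b) < 0:   # dq/dt changes sign between a and b
            before, after = dq_dp(a), dq_dp(b)
            index += 1 if before < 0 < after else -1
    return index


print("index of the circle:", maslov_index_circle())
```

The two singular points are (p, q) = (0, 1) and (0, −1); at both of them the curve passes from the negative to the positive side.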

Finally, the Morse index of a phase curve in $\mathbf{R}^{2n}$ can now be defined as the 
Maslov index of a curve in an (n + 1)-dimensional lagrangian manifold in a 
suitable (2n + 2)-dimensional phase space. As coordinates in this space we 
will take $(p_0, p; q_0, q)$ (where $(p, q) \in \mathbf{R}^{2n}$). If we set $q_0 = t$ and $p_0 = -H(p, q)$, 
and let the point (p, q) range over the n-dimensional lagrangian manifold in 
$\mathbf{R}^{2n}$ obtained from the original after time t by the action of the phase flow, 
then under change of t the points in $\mathbf{R}^{2n+2}$ form an (n + 1)-dimensional 
lagrangian manifold. The graph of the motion of a phase point under the 
action of the phase flow can be considered as a curve on this (n + 1)-dimen- 
sional lagrangian manifold. We can verify that the Maslov index of this graph 
agrees with the Morse index of the original phase curve. 


C Indices of closed curves 


The indices of closed curves on lagrangian submanifolds of a linear phase 
space can also be calculated with the help of a complex structure. In addition 
to the symplectic structure dp ∧ dq on the linear phase space $\mathbf{R}^{2n} = \{(p, q)\}$, 
we introduce a euclidean structure (with scalar square $p^2 + q^2$) and a 
complex structure, in which multiplication by i is 


$$I\colon \mathbf{R}^{2n} \to \mathbf{R}^{2n}, \qquad I(p, q) = (-q, p); \qquad z = p + iq, \quad \mathbf{C}^n = \{z\}.$$

All three structures are connected by the relation 


$$[x, y] = (Ix, y),$$


where the square brackets denote the skew-scalar product. 

Linear transformations of the phase space preserving any two (and, 
therefore, all three) structures are called unitary transformations. Such trans- 
formations take lagrangian planes to lagrangian planes. 

Every lagrangian plane can be obtained from any other (e.g., from the 
real plane $\mathbf{R}^n$ given by the equation q = 0) by a unitary transformation. In 
addition, any two unitary transformations A and B carrying the real plane 
to the same lagrangian plane differ by a unitary transformation which is a 
real orthogonal transformation: 


$$B = AC, \qquad \text{where } C\mathbf{R}^n = \mathbf{R}^n.$$


Conversely, any preliminary orthogonal transformation does not change the 
image of the plane under the action of a unitary transformation. 

We now note that the determinant of an orthogonal transformation is 
equal to ±1. Therefore the square of the determinant of a unitary 
transformation carrying the real plane to a given lagrangian plane depends only on the 
lagrangian plane itself and does not depend at all on the choice of unitary 
transformation. 

After these preliminary remarks we return to our lagrangian manifold 
and closed oriented curve lying in it. At every point of the curve, there is a 
plane tangent to the lagrangian manifold in the symplectic vector space. The 
square of the determinant of the unitary transformation carrying the real 
plane to this tangent plane is a complex number with modulus one. As a 
point moves along our closed curve, this complex number changes. After an 
entire circuit of the curve, the square of the determinant makes some integral 
number of rotations around the origin on the plane of complex variables, 
oriented from 1 to i. This integer is the index of the closed curve. 
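
For a curve in the phase plane (n = 1) this recipe is easy to carry out numerically. In the sketch below, which is our own illustration, the plane is identified with C by z = p + iq; the unitary transformation taking the real line q = 0 to the tangent line is multiplication by the unit tangent vector u, its determinant is u itself, and the index is the number of turns made by u² around the origin.

```python
import cmath
import math


def index_by_winding(tangent, n=4000):
    """Number of rotations of det**2 around 0 for a closed curve in the phase
    plane; tangent(t) returns the tangent vector (dp, dq) for t in [0, 2*pi]."""
    total = 0.0
    prev = None
    for k in range(n + 1):
        dp, dq = tangent(2.0 * math.pi * k / n)
        z = complex(dp, dq)       # z = p + i q
        u = z / abs(z)            # unitary map: real line -> tangent line
        w = u * u                 # squared determinant; the sign of u drops out
        if prev is not None:
            total += cmath.phase(w / prev)   # small increment of the argument
        prev = w
    return round(total / (2.0 * math.pi))


def circle_tangent(t):
    """Tangent vector of the circle p = cos t, q = sin t."""
    return (-math.sin(t), math.cos(t))


print("index of the circle:", index_by_winding(circle_tangent))
```

For the circle the squared determinant equals −e^{2it}, which makes two counterclockwise turns; reversing the orientation of the curve changes the sign of the index.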

The indices of closed curves enter into asymptotic formulas for stationary 
problems (characteristic oscillations). Assume that the phase flow cor- 
responding to the potential U has an invariant lagrangian manifold lying on 
the energy level H = E. Then the equation 


$$\tfrac{1}{2} \Delta \psi = \lambda^2 (U(q) - E) \psi$$


has a series of eigenvalues $\lambda_N \to \infty$ with asymptotic form $\lambda_N = \mu_N + O(\mu_N^{-1})$ 
if, for every closed contour γ on the lagrangian manifold, we have the 
congruence 


$$\frac{2\mu_N}{\pi} \oint_\gamma p \, dq \equiv \operatorname{ind} \gamma \pmod 4.$$

In the one-dimensional case, the lagrangian manifold is a circle, its index 
is equal to 2, and the formula above reduces to the so-called “quantization 
condition” 


$$\mu_N \oint_\gamma p \, dq = 2\pi \left( N + \tfrac{1}{2} \right).$$
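
As an illustration of the one-dimensional case, the sketch below evaluates the loop integral of p dq numerically and solves the quantization condition, taken here in the form μ_N ∮ p dq = 2π(N + 1/2), for μ_N. The potential U(q) = q²/2 and the function names are our own choices; for this potential the level curve is a circle of area 2πE, so the numbers can be compared with the exact values μ_N = (N + 1/2)/E.

```python
import math


def action_integral(U, E, q_minus, q_plus, n=2000):
    """Approximate the loop integral of p dq over the curve p**2/2 + U(q) = E
    between the turning points q_minus < q_plus.

    The substitution q = mid + half*sin(theta) removes the square-root
    singularities at the turning points; the midpoint rule is used in theta."""
    mid, half = 0.5 * (q_plus + q_minus), 0.5 * (q_plus - q_minus)
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + math.pi * (k + 0.5) / n
        q = mid + half * math.sin(theta)
        p = math.sqrt(max(0.0, 2.0 * (E - U(q))))
        total += p * half * math.cos(theta) * (math.pi / n)
    return 2.0 * total   # the branches p > 0 and p < 0 contribute equally


def mu(N, U, E, q_minus, q_plus):
    """mu_N solving mu_N * (loop integral of p dq) = 2*pi*(N + 1/2)."""
    return 2.0 * math.pi * (N + 0.5) / action_integral(U, E, q_minus, q_plus)


def oscillator(q):
    return 0.5 * q * q


E = 1.0
a = math.sqrt(2.0 * E)   # turning points of the oscillator are -a and +a
for N in range(4):
    print(N, mu(N, oscillator, E, -a, a))
```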


The eigenfunctions corresponding to these eigenvalues are also associated with lagrangian 
manifolds, but this association is not so simple. In fact, we cannot write down asymptotic 
formulas for eigenfunctions, but only for functions approximately satisfying the equations of 
characteristic functions. These functions turn out to be small outside the projection of the lagran- 
gian manifold onto the configuration space. The asymptotic formulas have singularities near 
the caustics formed by the projection. 

The actual eigenfunctions, however, can behave entirely differently, at least if the eigen- 
value is multiple or if there are eigenvalues close to it (cf. Appendix 10). 

Lagrangian singularities are singularities of projections of lagrangian 
manifolds onto configuration space. Such singularities are encountered in 
investigating global solutions to the Hamilton-Jacobi equation, in studying 
caustics, focal or conjugate points, in analyzing the propagation of dis- 
continuities and shock waves in the mechanics of a solid medium, and also in 
problems of short wave asymptotics (cf. Appendix 11). 

In order to describe lagrangian singularities we must first say a few words 
about singularities of smooth mappings in general. We begin with the 
simplest examples. 


A Singularities of smooth mappings of a surface onto a plane 


The mapping projecting a sphere onto a plane is singular on the equatorial 
circle (at points of the equator the rank of the derivative drops to one). As a 
result, a curve is formed on the plane of projection (the so-called apparent 
contour) bounding regions in which points have different numbers of pre- 
images: every point of the plane inside the apparent contour has two 
pre-images, and every point outside has none. 

In more complicated cases of “apparent contours” there can be more 
complicated singularities. Consider, for example, the surface given in three- 
dimensional space with coordinates (x, y, z) by the equation (Figure 245) 


$x = yz - z^3$


and the mapping of projection parallel to the z-axis onto the plane with 
coordinates (x, y). 

The singular points of the projection form a smooth curve on the surface 
(with equation $3z^2 = y$). However, the image of this curve on the (x, y) plane
is not a smooth curve. This image is a semi-cubical parabola with a cusp at 
the point (0, 0) with equation 

$27x^2 = 4y^3.$

Such a curve divides the plane into two parts: a smaller part (inside the 
cusp) and a larger part (outside). Over each point of the smaller part there 
are three points of our surface, and over each point of the larger part there is 
only one. 
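The count of pre-images on either side of the cusp can be checked directly. This is a minimal sketch: the function names are hypothetical, and the root count assumes a generic point $(x, y)$, so that all roots in $z$ are simple and none lands exactly on a grid node.

```python
def count_preimages(x, y, n=20_000):
    """Count real z with y*z - z**3 == x by counting sign changes on a grid."""
    bound = 1.0 + max(abs(x), abs(y))  # Cauchy bound on the roots of z^3 - y z + x
    g = lambda z: y * z - z ** 3 - x
    h = 2.0 * bound / n
    count = 0
    prev = g(-bound)
    for k in range(1, n + 1):
        cur = g(-bound + k * h)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count

def inside_cusp(x, y):
    """True when (x, y) lies in the smaller region bounded by 27 x^2 = 4 y^3."""
    return 27 * x * x < 4 * y ** 3

def apparent_contour(z):
    """Image of the singular point with 3 z^2 = y under projection along z."""
    y = 3 * z * z
    return (y * z - z ** 3, y)
```

For instance, `count_preimages(0.1, 1.0)` should report three points of the surface over a point inside the cusp and `count_preimages(1.0, 1.0)` one point over a point outside, while `apparent_contour(z)` traces the semi-cubical parabola.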

We now consider any small deformation of our surface. It turns out that, 
under projection of any surface close to ours, the apparent contour will 
always have a similar singularity (semi-cubical cusp) at some point close to 
the singularity of the apparent contour of the original surface. In other words, 
this singularity is not removable by a small perturbation of the surface. 

Furthermore, in place of a deformation of the surface, we can arbitrarily 
deform the mapping itself of the surface to the plane (no longer caring 
whether it is a projection), as long as it remains smooth and the deformation 
is small. It turns out that, for these deformations too, the cusp does not dis- 
appear but is only slightly deformed. 

Figure 245 Whitney’s tuck

The examples presented here exhaust all typical singularities of mappings
of a surface to the plane. It can be shown that all more complicated
singularities are removable by a small perturbation. Therefore, by slightly
deforming any smooth mapping, we can always arrange that in a neighborhood
of any point of the surface, the mapping will be either nonsingular, or
structurally similar to the projection mapping of a sphere onto a plane near
the equator, or structurally similar to the projection mapping of the surface
considered above with a semi-cubical cusp on the apparent contour.

The words “structurally similar to” mean that, on the pre-image surface 
and the image plane, we can choose local coordinates (in a neighborhood of 
our point and its image) such that in these coordinates the mapping will be 
written in a special way. Namely, the normal forms to which the mapping 
of the surface to the plane will be reduced in a neighborhood of points of the 
three types indicated above will be 


$y_1 = x_1$, $y_2 = x_2$ (nonsingular point)
$y_1 = x_1^2$, $y_2 = x_2$ (a fold, as on the equator of the sphere)
$y_1 = x_1 x_2 - x_1^3$, $y_2 = x_2$ (a “tuck” with a cusp on the apparent contour)


Here $(x_1, x_2)$ are the local coordinates in the pre-image, and $(y_1, y_2)$ are the
local coordinates in the image.

The proof of this theorem (it is due to H. Whitney) and its multidimen- 
sional generalizations can be found in works on the theory of singularities of 
smooth maps, such as 


V. I. Arnold, Singularities of smooth mappings, Russian Math. Surveys 23:1 (1968) 1-44.


Symposium on Singularities of Smooth Manifolds and Maps, Univ. of Liverpool, 1969-70. 
Proceedings. Springer, 1971. See especially the article of R. Thom and H. Levine. 


Golubitsky and Guillemin, Stable Mappings and Their Singularities, Springer-Verlag, 1973. 
B Singularities of projection of lagrangian manifolds


We now consider an n-dimensional configuration manifold, the correspond- 
ing 2n-dimensional phase space, and an n-dimensional lagrangian submani- 
fold (i.e., an n-dimensional submanifold on which the 2-form giving the 
symplectic structure of the phase space is identically zero). 

By projecting the lagrangian manifold onto the configuration space, we 
obtain a mapping of one smooth n-dimensional manifold to another. At most 
points, this mapping is a local diffeomorphism, but at some points of the 
lagrangian manifold the rank of the differential drops. These points are said 
to be singular. Under projection of the set of singular points to the configura- 
tion space an “apparent contour” is formed, which is called a caustic in the 
lagrangian case. 

Caustics can have complicated singularities; however, as in the usual 
theory of singularities of smooth maps, we can get rid of singularities which 
are too complicated by a small perturbation (here, by a small perturbation, 
we mean a small deformation of a lagrangian manifold in phase space under 
which this manifold remains lagrangian). 

After this there remain only the simplest unremovable singularities, for 
which we can write out normal forms and which we can study once and for all. 
When considering problems in general position which do not satisfy any 
special properties of symmetry, it is natural to expect that only these simple 
unremovable singularities will appear. 

Consider, for example, the caustics formed on a wall by light from a point 
source reflected from some smooth curved surface (here the four-dimensional 
phase space is formed by straight lines intersecting the surface of the wall in 
all possible directions, and the lagrangian submanifold by the rays of light 
coming from the source as they intersect the wall). By moving the source, we 
can see that generally the caustics have only simple singularities (semi- 
cubical cusps), while more complicated singularities appear only for special, 
exceptional positions of the source. 

We will give below, for $n \le 5$, normal forms for singularities of the projection
of an n-dimensional lagrangian submanifold of 2n-dimensional phase
space onto an n-dimensional configuration space. There are a finite number
of these normal forms, and their classification is related (in a rather mysterious
way) to the classifications of simple Lie groups, simple degenerate
critical points of functions, regular polyhedra, and many other objects. For
$n \ge 6$, the normal forms of some singularities must inevitably contain
parameters. For further details the reader is referred to the articles:


V. I. Arnold, Normal forms for functions near degenerate critical points, the Weyl groups of
$A_k$, $D_k$, $E_k$, and lagrangian singularities, Functional Analysis and Its Applications 6:4 (1972)
254-272. 


V. I. Arnold, Critical points of smooth functions and their normal forms, Uspekhi Mat. Nauk 
30:5 (1975). 
C Tables of normal forms of typical singularities
of projections of lagrangian manifolds of
dimension $n \le 5$


We will use the following notation: 
$(q_1, \ldots, q_n)$ are coordinates on the configuration space,
$(p_1, \ldots, p_n)$ are the corresponding momenta,


so that p and q together form a symplectic coordinate system in the phase 
space. 

We will give a lagrangian manifold with the help of a generating function 
F by the formulas 


$q_i = \frac{\partial F}{\partial p_i}, \qquad p_j = -\frac{\partial F}{\partial q_j},$


where the index $i$ runs over some subset of $\{1, \ldots, n\}$ and $j$ runs over the
remainder of $\{1, \ldots, n\}$. That is, $i = 1$, $j > 1$ for singularities denoted in the list
by $A_k$, and $i = 1, 2$, $j > 2$ for singularities denoted by $D_k$ and $E_k$.

With this notation, one and the same expression $F(p_i, q_j)$ can be considered
as giving a lagrangian manifold in spaces of a different number of
dimensions: we can add arbitrarily many arguments $q_j$, on which $F$ does not
actually depend.

The list of normal forms of typical singularities is now as follows: for
$n = 1$,


$A_1$: $F = p_1^2$;   $A_2$: $F = \pm p_1^3$;
for $n = 2$, in addition to the two above, there is
$A_3$: $F = \pm p_1^4 + q_2 p_1^2$;
for $n = 3$, in addition to the three preceding, there are
$A_4$: $F = \pm p_1^5 + q_3 p_1^3 + q_2 p_1^2$;
$D_4$: $F = \pm p_1^2 p_2 \pm p_2^3 + q_3 p_2^2$;
for $n = 4$, in addition to the five preceding, there are
$A_5$: $F = \pm p_1^6 + q_4 p_1^4 + q_3 p_1^3 + q_2 p_1^2$;
$D_5$: $F = \pm p_1^2 p_2 \pm p_2^4 + q_4 p_2^3 + q_3 p_2^2$;
for $n = 5$, in addition to the seven preceding, there are
$A_6$: $F = \pm p_1^7 + q_5 p_1^5 + \cdots + q_2 p_1^2$;
$D_6$: $F = \pm p_1^2 p_2 \pm p_2^5 + q_5 p_2^4 + q_4 p_2^3 + q_3 p_2^2$;
$E_6$: $F = \pm p_1^3 \pm p_2^4 + q_5 p_1 p_2^2 + q_4 p_1 p_2 + q_3 p_2^2$.
D Discussion of the normal forms


A point of type $A_1$ is nonsingular. A singularity of type $A_2$ is a fold singularity.
If we take $(p_1, q_2, \ldots, q_n)$ as coordinates on the lagrangian manifold, then
the projection mapping may be written as


$(p_1, q_2, \ldots, q_n) \mapsto (\pm 3p_1^2, q_2, \ldots, q_n).$


A singularity of type $A_3$ is a tuck with a semi-cubical cusp on the apparent
contour. To convince ourselves of this, it is enough to write out the corresponding
mapping of the two-dimensional lagrangian manifold to the
plane:


$(p_1, q_2) \mapsto (\pm 4p_1^3 + 2q_2 p_1, q_2).$
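
The cusp on the caustic of this mapping can be made explicit. The sketch below is illustrative only (function names are hypothetical): it takes the first component $q_1 = \partial F/\partial p_1$ for $F = \pm p_1^4 + q_2 p_1^2$, locates the critical set where $\partial q_1/\partial p_1 = \pm 12p_1^2 + 2q_2$ vanishes, and maps it to the plane. A short hand computation then suggests that the image satisfies $27q_1^2 \pm 8q_2^3 = 0$, a semi-cubical parabola.

```python
def a3_projection(p1, q2, sign=1):
    """Lagrangian map of A_3: (p1, q2) -> (q1, q2), q1 = dF/dp1 for F = sign*p1**4 + q2*p1**2."""
    return (sign * 4 * p1 ** 3 + 2 * q2 * p1, q2)

def a3_critical_q2(p1, sign=1):
    """Value of q2 on the critical set, where sign*12*p1**2 + 2*q2 = 0."""
    return -sign * 6 * p1 ** 2

def a3_caustic_point(p1, sign=1):
    """Point of the caustic: image of the critical point with first coordinate p1."""
    return a3_projection(p1, a3_critical_q2(p1, sign), sign)

def cusp_defect(p1, sign=1):
    """Evaluates 27*q1**2 + 8*sign*q2**3 on the caustic; expected to vanish."""
    q1, q2 = a3_caustic_point(p1, sign)
    return 27 * q1 ** 2 + 8 * sign * q2 ** 3
```

With the plus sign the caustic is parametrized by $p_1 \mapsto (-8p_1^3, -6p_1^2)$, with its cusp at the origin.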


A singularity of type $A_4$ first appears in the three-dimensional case, and
the corresponding caustic is represented by a surface in three-dimensional
space (Figure 246) with a singularity called a swallowtail (we already
encountered this in Section 46).

The caustic of a singularity of type $D_4$ in three-dimensional space is
represented as a surface with three cuspidal edges (of type $A_3$), tangent at
one point; two of these cuspidal edges can be imaginary, so that there are
two versions of the caustic of $D_4$.


Figure 246 Typical singularities of caustics in three-dimensional space


E Lagrangian equivalence 


We must now say in what sense the examples mentioned are normal forms of 
typical singularities of projections of lagrangian manifolds. First of all, we 
will define which singularities we will consider to have the “same structure.” 

A projection mapping of a lagrangian manifold onto configuration space 
will be called a lagrangian mapping for short. Suppose that we are given two 
lagrangian mappings of manifolds of the same dimension n (the corresponding
n-dimensional lagrangian manifolds lie, in general, in different phase
spaces which are cotangent bundles of two different configuration spaces). We 
say that two such lagrangian mappings are lagrangian equivalent if there is a 
symplectic diffeomorphism of the first phase space to the second, taking 
fibers of the first cotangent bundle to fibers of the second, and taking the first 
lagrangian manifold to the second. The symplectic diffeomorphism itself is 
then called a lagrangian equivalence mapping. 

We note that two lagrangian equivalent lagrangian mappings are taken 
one to the other with the help of diffeomorphisms in the pre-image space and 
the image space (or, as they say in analysis, are carried to one another by a 
change of coordinates in the pre-image and in the image). In fact, our sym- 
plectic diffeomorphism restricted to the lagrangian manifold gives a diffeo- 
morphism of the pre-images; a diffeomorphism of the configuration-space 
images arises because fibers are carried to fibers. 

In particular, the caustics of the two lagrangian equivalent mappings are 
diffeomorphic, hence a classification up to lagrangian equivalence implies a 
classification of caustics. However, the classification up to lagrangian equiv- 
alence is finer than the classification of caustics, since a diffeomorphism of 
caustics does not in general give rise to a lagrangian equivalence of the map- 
pings. Furthermore, the classification up to lagrangian equivalence is finer 
than the classification up to diffeomorphisms of the pre-image and image,
since not every such pair of diffeomorphisms is realized by a symplectic 
diffeomorphism of the phase space. 

A lagrangian mapping considered in a neighborhood of some chosen point 
is called lagrangian equivalent at that point to another lagrangian mapping 
(also with a chosen point), if there is a lagrangian equivalence of the first 
mapping in some neighborhood of the first point onto the second in some 
neighborhood of the second point, carrying the first point to the second. 

We can now formulate a classification theorem for singularities of 
lagrangian mappings in dimensions $n \le 5$.


Theorem. Every n-dimensional lagrangian manifold ($n \le 5$) can, by an arbitrarily
small perturbation in the class of lagrangian manifolds, be made into
one such that the projection mapping onto the configuration space will be 
lagrangian equivalent at every point to one of the lagrangian mappings in 
the list above. 


In particular, a two-dimensional lagrangian manifold can be put in 
“general position” by an arbitrarily small perturbation in the class of 
lagrangian manifolds, so that the projection mapping onto the configuration 
space (two-dimensional) will not have singularities other than folds (which 
can be reduced by a lagrangian equivalence to the normal form $A_2$) or tucks
(which can be reduced by a lagrangian equivalence to the normal form $A_3$).

We note that this assertion about two-dimensional lagrangian mappings does not follow
from the classification theorem for general (non-lagrangian) mappings. In the first place,
lagrangian mappings make up a very restricted class among all smooth mappings, and therefore
they can (and actually do for $n > 2$) have as typical singularities ones which are not typical for
mappings of general form. Secondly, the possibility of reducing a mapping to normal form by
diffeomorphisms of the pre-image and image does not imply that this can be done using a
lagrangian equivalence.


In this way, the caustics of a two-dimensional lagrangian manifold in 
general position have as singularities only semi-cubical cusps (and points of 
transversal intersection). All more complicated singularities break up under
a small perturbation of the lagrangian manifold; the resulting cusps and
self-intersection points of caustics are unremovable by small perturbations, and
are only slightly deformed.

Normal forms of the singularities $A_k$, $D_k$, ... can be used in a similar way
for studying the caustics of lagrangian manifolds of higher dimensions, and
also for studying the development of caustics of low-dimensional lagrangian
manifolds, when parameters on which the manifold depends are varied.¹¹⁶


Other applications of the formulas of this section can be found in the theory of Legendre
singularities, i.e., singularities of wave fronts, Legendre transforms, envelopes, and convex hulls
(cf. Appendix 4). The theories of lagrangian and Legendre singularities have direct application,
not only in geometric optics and the theory of asymptotics of oscillating integrals, but also in 
the calculus of variations, in the theory of discontinuous solutions of nonlinear partial differential 
equations, in optimization problems, pursuit problems, etc. R. Thom has suggested the general 
name catastrophe theory for the theory of singularities, the theory of bifurcations, and their 
applications. 


116 See, e.g., V. Arnold, Evolution of wavefronts and equivariant Morse lemma, Comm. Pure 
Appl. Math., 1976, No. 6. 




Appendix 13: The Korteweg—de Vries equation 


Not all first integrals of equations in classical mechanics are explained by 
obvious symmetries of a problem (examples are specific integrals of Kepler’s 
problem, the problem of geodesics on an ellipsoid, etc.). In such cases, we 
speak of “hidden symmetry.”¹¹⁷

Interesting examples of such hidden symmetry are furnished by the 
Korteweg-de Vries equation 


(1) $u_t = 6uu_x - u_{xxx}$.


This nonlinear partial differential equation first arose in the theory of 
waves in shallow water; later it turned out that this equation is encountered 
in a whole series of problems in mathematical physics.

As a result of a series of numerical experiments, remarkable properties 
of solutions of this equation with zero boundary conditions at infinity were 
discovered: as $t \to \infty$ and $t \to -\infty$ these solutions decompose into “solitons,”
waves of definite form moving with different velocities.


To obtain a soliton moving with velocity $c$, it is sufficient to substitute the function
$u = \varphi(x - ct)$ into equation (1). Then we obtain the equation $\varphi'' = 3\varphi^2 + c\varphi + d$ for $\varphi$
($d$ is a parameter). This is Newton’s equation with a cubic potential. There is a saddle on the
phase plane $(\varphi, \varphi')$. The separatrix that leaves this saddle and returns to it, in the case where the
saddle lies at $\varphi = 0$, determines a solution $\varphi$ tending to 0 as $x \to \pm\infty$; it is a soliton.
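
As an illustration, one can test a candidate soliton profile numerically. The closed form used below, $\varphi(x) = -(c/2)\,\mathrm{sech}^2(\sqrt{c}\,x/2)$ with $c > 0$ and $d = 0$, is not stated in the text; it is the standard solitary-wave profile for this sign convention and is introduced here as an assumption, as are the function names. The sketch evaluates the residual of $\varphi'' = 3\varphi^2 + c\varphi$ with a central second difference.

```python
import math

def soliton(x, c):
    """Candidate profile phi(x) = -(c/2) / cosh(sqrt(c) * x / 2)**2, for c > 0."""
    return -0.5 * c / math.cosh(0.5 * math.sqrt(c) * x) ** 2

def residual(x, c, h=1e-3):
    """phi'' - 3*phi**2 - c*phi at x (case d = 0), via a central second difference."""
    phi = soliton(x, c)
    phi_xx = (soliton(x + h, c) - 2.0 * phi + soliton(x - h, c)) / (h * h)
    return phi_xx - 3.0 * phi * phi - c * phi

# Largest residual on a sample grid, for the illustrative velocity c = 2.
worst = max(abs(residual(0.1 * k, 2.0)) for k in range(-100, 101))
```

The residual stays at the level of the finite-difference error, and the profile decays to 0 in both directions, as a separatrix solution should.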


When solitons collide, there is a complicated nonlinear interaction. 
However, numerical experiments showed that the sizes and velocities of the 
solitons do not change as a result of collision. And, in fact, Kruskal, Zabusky, 
Lax, Gardner, Green, and Miura succeeded in finding a whole series of first 
integrals for the Korteweg-de Vries equation. These integrals have the form 
$I_s = \int P_s(u, \ldots, u^{(s)})\,dx$, where $P_s$ is a polynomial. For example, it is easy to
verify that the following are first integrals of equation (1):


$I_{-1} = \int u\,dx$,   $I_0 = \int u^2\,dx$,   $I_1 = \int \left(\frac{u'^2}{2} + u^3\right) dx$,

$I_2 = \int \left(\frac{u''^2}{2} - \frac{5}{2}u^2 u'' + \frac{5}{2}u^4\right) dx.$


The appearance of an infinite series of first integrals is easily explained by 
the following theorem of Lax.¹¹⁸ We will denote the operator of multiplication
by a function of $x$ by the symbol for the function itself, and the operator
of differentiation with respect to $x$ by the symbol $\partial$. Consider the Sturm-Liouville
operator $L = -\partial^2 + u$ depending on a function $u(x)$. We verify
directly:


Theorem. The Korteweg-de Vries equation (1) is equivalent to the equation 
$\dot L = [L, A]$, where $A = 4\partial^3 - 3(u\partial + \partial u)$.
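
The direct verification can be imitated in exact arithmetic. The sketch below is an illustration, not a proof: it represents $u$ and a test function $f$ as polynomials with integer coefficients (lowest degree first; both are arbitrary choices), applies $L = -\partial^2 + u$ and $A = 4\partial^3 - 3(u\partial + \partial u)$, and compares $[L, A]f$ with $(6uu' - u''')f$. All helper names are hypothetical.

```python
def add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]

def scale(k, a):
    return [k * c for c in a]

def mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def d(a, times=1):
    """Differentiate a polynomial `times` times."""
    for _ in range(times):
        a = [i * c for i, c in enumerate(a)][1:] or [0]
    return a

def trim(a):
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def L(u, f):
    # L f = -f'' + u f
    return add(scale(-1, d(f, 2)), mul(u, f))

def A(u, f):
    # A f = 4 f''' - 3 (u f' + (u f)')
    return add(scale(4, d(f, 3)), scale(-3, add(mul(u, d(f)), d(mul(u, f)))))

def commutator(u, f):
    # [L, A] f = L(A f) - A(L f)
    return trim(add(L(u, A(u, f)), scale(-1, A(u, L(u, f)))))

def kdv_rhs_times(u, f):
    # (6 u u' - u''') f
    return trim(mul(add(scale(6, mul(u, d(u))), scale(-1, d(u, 3))), f))

u = [1, -2, 0, 3, 1]     # u(x) = 1 - 2x + 3x^3 + x^4, an arbitrary test polynomial
f = [2, 0, 5, -1, 0, 7]  # f(x) = 2 + 5x^2 - x^3 + 7x^5
```

For these inputs `commutator(u, f)` and `kdv_rhs_times(u, f)` agree coefficient by coefficient, which is consistent with $[L, A]$ being the operator of multiplication by $6uu' - u'''$.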


117 The term “accidental symmetry” is frequently used in English. [Trans. note.]


118 Lax, P. D., Integrals of nonlinear equations of evolution and solitary waves, Comm. Pure
Appl. Math. 21 (1968) 467-490. 

Directly from this theorem of Lax, we have


Corollary. The operators L constructed from a solution of equation (1) are 
unitarily equivalent for all $t$; in particular, each of the eigenvalues $\lambda$ of the
Sturm-Liouville problem $Lf = \lambda f$ with zero boundary conditions at infinity
is a first integral of the Korteweg-de Vries equation. 


Gardner, V. E. Zakharov and L. D. Faddeev noted that equation (1) is a 
completely integrable infinite-dimensional hamiltonian system, and found 
the corresponding action-angle variables.¹¹⁹ A symplectic structure on the
space of functions vanishing at infinity is given by the skew-scalar product
$\omega^2(\partial w, \partial v) = \frac{1}{2}\int (w\,\partial v - v\,\partial w)\,dx$, and the hamiltonian of equation (1) is the
integral $I_1$. In other words, equation (1) can be written in the form of Hamilton’s
equation in the functional space of functions of $x$, $\dot u = (d/dx)(\delta I_1/\delta u)$.

Every integral $I_s$ gives in this way a “higher Korteweg-de Vries equation”
$\dot u = Q_s[u]$, where $Q_s = (d/dx)(\delta I_s/\delta u)$ is a polynomial in the derivatives
$u, u', \ldots, u^{(2s+1)}$. The integrals $I_s$ are in involution, and the flows corresponding
to them on the functional space commute. 


The explicit form of the polynomials $P_s$ and $Q_s$, and also the explicit form of the action-angle
variables (and therefore of solutions of equation (1)), is described in terms of solutions of
the direct and inverse problems of scattering theory with potential $u$.

The explicit form of the polynomials $Q_s$ can also be obtained from the following theorem of
Gardner, generalizing Lax’s theorem. In the space of functions of $x$, we consider a differential
operator of the form $A = \sum p_i \partial^{m-i}$, where $p_0 = 1$, and the remaining coefficients $p_i$ are
polynomials in $u$ and the derivatives of $u$ with respect to $x$. It turns out that, for any $s$ there is
an operator $A_s$ of order $2s + 1$ such that its commutator with the Sturm-Liouville operator $L$
is the operator of multiplication by a function $[L, A_s] = Q_s$.

The operator $A_s$ is defined by these conditions uniquely up to the addition of linear
combinations of the $A_r$ with $r < s$; in the same way, the polynomials $Q_s$ are determined up to the addition
of linear combinations of the preceding $Q_r$’s.


V. E. Zakharov, A. B. Shabat, L. D. Faddeev, and others, using Lax’s 
method and techniques of inverse scattering theory, have studied a whole 
series of physically important equations, including the equations $u_{tt} - u_{xx} = \sin u$
and $i\psi_t + \psi_{xx} \pm \psi|\psi|^2 = 0$.

Investigation of the problem with periodic boundary conditions for the 
Korteweg-de Vries equation led S. P. Novikov¹²⁰ to the discovery of an
interesting class of completely integrable systems with a finite number of 
degrees of freedom. These systems are constructed in the following way. 

Consider any finite linear combination of first integrals, $I = \sum c_i I_{n-i}$,
and let $c_0 = 1$. The set of stationary points of the flow with hamiltonian $I$
on the functional space is invariant under the phase flows with hamiltonians
$I_s$, including the phase flow of equation (1).


119 Zakharov, V. E. and Faddeev, L. D., The Korteweg-de Vries equation is a completely
integrable hamiltonian system, Functional Analysis and Its Applications, 5:4 (1971) 280-287.


120 Novikov, S. P., The periodic problem for the Korteweg-de Vries equation, Functional
Analysis and Its Applications, 8:3 (1974) 236-246.

On the other hand, these stationary points are determined from the 
equations $(d/dx)(\delta I/\delta u) = 0$, or $\delta I/\delta u = d$. The second equation is the
Euler-Lagrange equation for the functional $I - dI_{-1}$, involving derivatives
of order n. Therefore, it has order 2n and can be written as a hamiltonian 
system of equations in 2n-dimensional euclidean space. 

It turns out that this hamiltonian system with n degrees of freedom has n 
integrals in involution and can be integrated completely with the help of 
suitable action-angle coordinates. In this way, we obtain a finite-dimensional 
family of particular solutions of the Korteweg-de Vries equation depending 
on 3n + 1 parameters (2n phase coordinates and n + 1 further parameters 
$c_1, \ldots, c_n, d$).

These solutions have, as Novikov showed, remarkable properties; for 
example, in the periodic problem they give functions u(x) for which the linear 
differential equation with periodic coefficients 


$-X'' + u(x)X = \lambda X$


has a finite number of zones of parametric resonance (cf. Section 25) on the
$\lambda$-axis.


After this book was written, much work was done on the subjects dis- 
cussed in this appendix, in particular by Novikov, Dubrovin, Krichever, 
Manakov, Matveev, Its, Dikii, Manin, Drinfeld, Gelfand, Lax, Moser, 
McKean, Van Moerbeke, Adler, Perelomov, Olshanetskii, and many others. 
Among other things, Manakov solved the Euler equations of a rigid body in
$\mathbb{R}^n$ for arbitrary $n$: these are completely integrable. For more details see the
forthcoming book by Novikov and his collaborators. (Note added by author 
in translation.) 

Appendix 14: Poisson structures


Along with the classical Poisson bracket of functions, one also encounters
more general (degenerate) brackets. A typical example is the Poisson bracket
of functions of the components $M_i$ of the angular momentum vector:
$\{F, G\} = \sum (\partial F/\partial M_i)(\partial G/\partial M_j)\{M_i, M_j\}$. Such degenerate brackets may be
considered as families of ordinary Poisson brackets or families of symplectic
manifolds. These families generally have singularities (they are not foliations): 
they consist of symplectic manifolds (leaves) of different dimensions, related 
to one another by the condition of smoothness for the given degenerate 
Poisson bracket structure on the ambient space. (In the angular momentum 
example above, the leaves are concentric spheres and their center at the 
origin.) 

In this appendix, we shall present the simplest elementary properties of 
Poisson structures on finite-dimensional manifolds. One should keep in mind, 
though, that in applications (especially to the mathematical physics of con- 
tinuous media) one frequently encounters Poisson structures on infinite- 
dimensional manifolds. In these cases, the symplectic leaves often (but not 
always) have finite dimension or codimension. 


A Poisson manifolds 


A Poisson structure on a manifold is a Lie algebra structure on its space 
of smooth functions (i.e., a bilinear skew-symmetric operation of “Poisson
bracket” on functions, satisfying the Jacobi identity) such that the operator
$\mathrm{ad}_a = \{a, \cdot\}$ (contraction of the Poisson bracket with any fixed function $a$) is
an operator of differentiation by some vector field $v_a$. The vector field $v_a$ is then
called the hamiltonian vector field with hamiltonian function $a$. The mapping
$a \mapsto v_a$ gives a homomorphism from the Lie algebra of functions to the Lie
algebra of vector fields. A manifold with a given Poisson structure is called a 
Poisson manifold. 

Two points on a Poisson manifold are called equivalent if they can be joined
by a path consisting of segments of integral curves of hamiltonian vector fields. 
The equivalence classes under this relation are called the leaves of the Poisson 
manifold. The values of all possible hamiltonian vector fields at a given point 
of a Poisson manifold form a linear space which is just the tangent space of 
the leaf through that point. Thus the leaves are smooth manifolds, but they 
are in general not closed, and they have different dimensions. 

The classical (explicitly described by S. Lie in 1890, but essentially con- 
sidered already by Jacobi) example of a Poisson manifold is the dual space of 
a (finite-dimensional) Lie algebra. The elements of the algebra itself may be 
considered as linear functions on this space. The Poisson structure is defined 
as an extension of the Lie algebra structure from this finite-dimensional sub- 
space to the entire space of smooth functions on the dual of the original Lie 
algebra. Such an extension exists and is unique: if $\omega_1, \ldots, \omega_n$ is a basis of the
original Lie algebra, then

$\{a, b\}_{\mathrm{Poisson}} = \sum (\partial a/\partial\omega_i)(\partial b/\partial\omega_j)\,[\omega_i, \omega_j]_{\mathrm{Lie}}.$


In this example, the leaves are the orbits of the co-adjoint representation of 
the underlying Lie group in the dual of its Lie algebra. 
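
For the algebra of angular momenta this bracket is easy to experiment with. The sketch below is a minimal illustration under an assumed sign convention, $\{M_i, M_j\} = \varepsilon_{ijk} M_k$ (the text does not fix the sign), for which the double sum collapses to $M \cdot (\nabla F \times \nabla G)$. The sample functions and all names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bracket(grad_f, grad_g, M):
    """{F, G}(M) = sum_ij dF/dM_i dG/dM_j {M_i, M_j} = M . (grad F x grad G)."""
    return dot(M, cross(grad_f(M), grad_g(M)))

def hamiltonian_field(grad_h, M):
    """Components {M_i, H} of the hamiltonian vector field of H at the point M."""
    basis = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    return tuple(dot(M, cross(e, grad_h(M))) for e in basis)

# Gradients of a few sample functions of M = (M1, M2, M3).
grad_M1 = lambda M: (1.0, 0.0, 0.0)
grad_M2 = lambda M: (0.0, 1.0, 0.0)
grad_casimir = lambda M: (2 * M[0], 2 * M[1], 2 * M[2])   # F = |M|^2
grad_energy = lambda M: (M[0] / 1.0, M[1] / 2.0, M[2] / 3.0)  # F = sum M_i^2 / (2 I_i), I = (1, 2, 3)

M = (0.3, -1.2, 2.0)
field = hamiltonian_field(grad_energy, M)
```

The bracket of $|M|^2$ with any function vanishes, and every hamiltonian vector field is tangent to the sphere through $M$, in agreement with the description of the leaves as co-adjoint orbits (spheres centered at the origin).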

Every leaf of a Poisson manifold carries a natural symplectic structure 
(closed nondegenerate 2-form), defined in the following way. Consider the 
values of two hamiltonian vector fields at a point of the leaf. The value of the 
2-form on this pair of vectors is defined to be the value of the Poisson bracket 
of the hamiltonian functions at the given point (this value depends only on the 
two vectors and not on the choice of hamiltonian functions). The fact that the 
form is closed on the leaf follows from the Jacobi identity; nondegeneracy 
comes from the fact that, if the derivative of every function by a given tangent 
vector is zero, then the vector itself must be zero. The phase flow of every 
hamiltonian vector field preserves the symplectic structures on the leaves. 

Thus, the leaves of a Poisson manifold are even dimensional, and the 
manifold may be considered as a union of symplectic manifolds (generally of
different dimensions), whose symplectic structures are coordinated by the 
condition that the Poisson bracket on the ambient space be smooth. 

For example, the co-adjoint orbits of SO(3) (spheres centered at the origin) 
may be organized according to local Darboux coordinates: in the neighbor- 
hood of any nonzero point, the Poisson structure in suitable local coordinates 
takes the form {x, y} = 1, {x,z} = {y,z} = 0. This normal form for the Poisson 
structure on the space of angular momenta is convenient in carrying out the 
process of elimination of the nodes in the many-body problem (see Section 
III.5.5 of the paper: V. I. Arnol’d, Small denominators and problems of 
stability of motion in classical and celestial mechanics, Russian Math. Surveys 
18, No. 6 (1963), 85-191). 
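
One concrete choice of such local coordinates can be checked numerically. The choice below is an assumption made for illustration and is not taken from the text: $x$ is the azimuthal angle of $M$ about the third axis, $y = M_3$, and $z = |M|$, valid away from the third axis, with the sign convention $\{M_i, M_j\} = \varepsilon_{ijk} M_k$. The function names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def bracket_at(grad_f, grad_g, M):
    """Value at M of {F, G} = M . (grad F x grad G), given the two gradients at M."""
    return dot(M, cross(grad_f, grad_g))

def darboux_gradients(M):
    """Gradients at M of x = atan2(M2, M1), y = M3, z = |M| (off the third axis)."""
    r2 = M[0] ** 2 + M[1] ** 2
    norm = math.sqrt(r2 + M[2] ** 2)
    grad_x = (-M[1] / r2, M[0] / r2, 0.0)
    grad_y = (0.0, 0.0, 1.0)
    grad_z = (M[0] / norm, M[1] / norm, M[2] / norm)
    return grad_x, grad_y, grad_z

M = (0.3, -1.2, 2.0)
gx, gy, gz = darboux_gradients(M)
xy = bracket_at(gx, gy, M)
xz = bracket_at(gx, gz, M)
yz = bracket_at(gy, gz, M)
```

At the sample point the three values come out as 1, 0, 0 up to rounding, matching the normal form $\{x, y\} = 1$, $\{x, z\} = \{y, z\} = 0$.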

Jacobi realized that the (classical) Poisson brackets of the first integrals 
of any hamiltonian system could be considered as a Poisson structure (this 
structure is discussed in Section VI.1.3 of the author’s paper cited above). 

The construction of a Poisson structure on the dual space of a Lie algebra 
leads to a new Lie algebra. This construction may then be repeated, leading 
to a whole series of new (infinite-dimensional) Poisson structures. More gen- 
erally, suppose that one is given any Poisson structure on a manifold. Then 
the space of functions on that manifold carries the structure of a Lie algebra. 
This implies that the dual space of this function space carries its own Poisson 
structure. Elements of this dual space may be interpreted as distribution den- 
sities on the original manifold. Thus, the space of distributions on a Poisson 
manifold (for example, on a symplectic phase space) has a natural Poisson 
structure. This structure makes it possible to apply the hamiltonian formalism 
to equations of Vlasov type, which describe the evolution of distributions of 
particles in phase space under the action of a field which is consistent with the 
particles themselves. 

B Poisson mappings


A mapping from one Poisson manifold to another is called a Poisson mapping 
if it is consistent with the Poisson structures, i.e., if for any two functions 
on the second manifold, the Poisson bracket of their pullbacks to the first 
manifold coincides with the pullback of their Poisson brackets. For example, 
the embedding of each symplectic leaf in a Poisson manifold is a Poisson 
mapping. 

The cartesian product of two Poisson manifolds has a natural Poisson 
structure, for which the projection on each factor is a Poisson mapping (the 
Poisson bracket of functions pulled back from different factors is zero). 

S. Lie showed that every Poisson manifold is locally (in the neighborhood 
of a point where the dimension of the symplectic leaves is locally constant, for 
example, in the neighborhood of a generic point, where the rank is locally 
maximal) decomposable into the product of a symplectic leaf and a complementary
space on which all Poisson brackets are zero.

On such a neighborhood, one may introduce coordinates $p_i$, $q_i$, $c_j$ such that
$p$ and $q$ have the usual symplectic Poisson brackets, while the Poisson bracket
of each $c_j$ with any function is equal to zero. In physics, the coordinates $p_i$ and
$q_i$ are called Clebsch variables,¹²¹ while the $c_j$’s are called Casimir functions.
Clebsch introduced his variables for the hamiltonian description of the hydro- 
dynamics of ideal fluids, while Casimir considered the center of the Lie algebra 
of functions on the dual space of a given Lie algebra. 

The dimension of the symplectic leaf through a nongeneric point of a 
Poisson manifold is less than that for nearby generic points. In the neighbor- 
hood of such a point, the Poisson manifold may still be represented as the 
product of a neighborhood of the point in its symplectic leaf and a neighbor- 
hood of a distinguished point in some Poisson manifold of complementary 
dimension. In other words, on a minimal transverse manifold to a symplectic 
leaf there arises a (unique up to diffeomorphism) local Poisson structure—the 
so-called transverse Poisson structure (cf. A. Weinstein, The local structure of 
Poisson manifolds, J. Diff. Geom. 18 (1983), 523-557).¹²² In the transverse
structure, the Poisson brackets of all functions are zero at the distinguished 
point (which may be taken as the origin of a coordinate system). The Taylor 
series for these brackets begin with 


$\{x_i, x_j\} = \sum_k c_{ij}^k x_k + \cdots,$


where $c_{ij}^k$ are the structure constants of a finite-dimensional Lie algebra (the
linearized transverse structure).


121 Translator’s note: The term Clebsch variables is also used to refer to canonical coordinates
on a symplectic manifold which projects onto (rather than embeds into) a Poisson manifold.


122 Warning: As A. B. Givental’ has noted, Theorem 3.1 in this paper is incorrect. (Translator’s
note: For further discussion, see A. Weinstein, Lie algebras and Poisson structures, Astérisque,
hors série (1985), 257-271.)

A natural question arises: Is it possible to annihilate the higher order terms 
in the Taylor series by a suitable change of coordinates? 

The question of the form of transverse structures was already raised by the 
author in Section VI.1.3 of the previously cited article. 

If the linearized algebra is semisimple and the Poisson structure is analytic, 
then one can eliminate the higher order terms of the Taylor series by an 
analytic change of coordinates: J. Conn, Linearization of analytic Poisson 
structures, Annals of Math. 119 (1984), 577-601. An analogous result is true 
for the $C^\infty$ case, when the linearized algebra is of compact type: J. Conn,
Linearization of $C^\infty$ Poisson structures, Annals of Math. (1985).

A. Weinstein, along with his earlier proof of an analogous result for formal 
series, expressed the conjecture that semisimplicity was a necessary condition 
for the annihilation of nonlinear terms. The study of singularities of Poisson 
structures in the plane (or, more generally, structures with symplectic leaves 
of codimension 2) leads, however, to a different conclusion. 


C Poisson structures in the plane 


From the point of view of differential geometry, a Poisson structure is given 
by a smooth bivector field on a manifold. In fact, the Poisson brackets at each 
point associate a number to each pair of cotangent vectors. Therefore they 
define a section of the second exterior power of the tangent bundle, i.e., a
bivector field. 

The Jacobi identity expresses a sort of “closedness” of this bivector field. 
On a two-dimensional manifold, this closedness condition is automatically 
satisfied everywhere, so that every smooth bivector field on the plane gives a 
Poisson structure. This circumstance allows one to apply to the classification 
of Poisson structures in the plane the usual considerations of general position 
(transversality, etc.). In terms of coordinates x, y, a bivector field may be
expressed in the form $f\,(\partial_x \wedge \partial_y)$, where f is a smooth function. The corresponding
Poisson structure is defined by the condition


$$\{x, y\} = f(x, y). \tag{1}$$


A Poisson structure on the plane may also be given by a differential 2-form 
$dx \wedge dy / f$. This form, like the bivector field, is invariantly connected with the
Poisson structure; however, unlike the bivector field, it has pole singularities 
along the curve f = 0. The leaves in this case are the points of the curve f = 0 
and the connected components of the complement of this curve in the plane. 
Points of the curve f = 0 are called singular points of the Poisson structure. 
In the neighborhood of a nonsingular point, any Poisson structure in the plane 
may be put into the normal form {x, y} = 1. 
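
The statement that every function f defines a Poisson structure on the plane can be checked mechanically for polynomial data. The following sketch is not part of the original text: it assumes polynomials in x, y stored as dictionaries from exponent pairs to integer coefficients, defines the bracket {g, h} = f (g_x h_y - g_y h_x), and forms the cyclic Jacobi sum. All function names are illustrative.

```python
# Polynomials in x, y are dicts {(i, j): coeff} standing for sum coeff * x^i * y^j.

def add(p, q, sign=1):
    """Return p + sign * q, dropping zero coefficients."""
    r = dict(p)
    for m, c in q.items():
        r[m] = r.get(m, 0) + sign * c
        if r[m] == 0:
            del r[m]
    return r

def mul(p, q):
    """Return the product p * q."""
    r = {}
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            m = (i + k, j + l)
            r[m] = r.get(m, 0) + c * d
            if r[m] == 0:
                del r[m]
    return r

def diff(p, var):
    """Partial derivative with respect to x (var = 0) or y (var = 1)."""
    r = {}
    for (i, j), c in p.items():
        e = (i, j)[var]
        if e:
            m = (i - 1, j) if var == 0 else (i, j - 1)
            r[m] = c * e
    return r

def bracket(f, g, h):
    """Poisson bracket {g, h} for the structure {x, y} = f."""
    jac = add(mul(diff(g, 0), diff(h, 1)), mul(diff(g, 1), diff(h, 0)), sign=-1)
    return mul(f, jac)

def jacobiator(f, a, b, c):
    """Cyclic sum {a,{b,c}} + {b,{c,a}} + {c,{a,b}}."""
    t1 = bracket(f, a, bracket(f, b, c))
    t2 = bracket(f, b, bracket(f, c, a))
    t3 = bracket(f, c, bracket(f, a, b))
    return add(add(t1, t2), t3)

if __name__ == "__main__":
    f = {(2, 1): 1, (0, 3): 1}      # x^2 y + y^3, chosen only as a sample
    a = {(2, 0): 1, (0, 1): 1}      # x^2 + y
    b = {(1, 1): 1}                 # x y
    c = {(1, 0): 1, (0, 3): 1}      # x + y^3
    print(jacobiator(f, a, b, c))   # an empty dict means the Jacobi sum vanishes
```

The check is exact because it uses integer arithmetic; it illustrates the two-dimensional statement for particular polynomials and is of course not a proof.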

The following diagram shows the beginning of the hierarchy of singularities 
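of Poisson structures on the plane in the neighborhood of a singular point.

$$A_0 \leftarrow A_1^{a} \leftarrow A_2 \leftarrow A_3^{a} \leftarrow A_4 \leftarrow A_5^{a} \leftarrow A_6 \leftarrow A_7^{a} \leftarrow A_8$$

$$D_4^{a,b} \leftarrow D_5^{a} \leftarrow D_6^{a,b} \leftarrow D_7^{a} \leftarrow D_8^{a,b}$$

$$E_6 \leftarrow E_7^{a} \leftarrow E_8.$$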


Each letter in the diagram represents a Poisson structure which, in suitable 
local coordinates with origin at the singular point under consideration, can 
be written in the form {x, y} = f, where the function f is given by Table 1. 


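Table 1

Type              f
$A_0$             $y$
$A_{2k-1}^{a}$    $(x^2 + y^{2k}) / (1 + a y^{k-1})$
$A_{2k}$          $x^2 + y^{2k+1}$
$D_{2k}^{a,b}$    $(x^2 y + y^{2k-1}) / (1 + a x + b y^{k-1})$
$D_{2k+1}^{a}$    $(x^2 y + y^{2k}) / (1 + a x)$
$E_6$             $x^3 + y^4$
$E_7^{a}$         $(x^3 + x y^3) / (1 + a y^2)$
$E_8$             $x^3 + y^5$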


Theorem. Given a Poisson structure on a two-dimensional manifold, it is either 
reducible in a neighborhood of each point to one of the normal forms in Table 
1, or it belongs to a set of codimension 8 in the space of Poisson structures. 


Thus, a generic Poisson structure may be reduced in a neighborhood of 
each point to the normal form {x, y} = 1 (nonsingular point) or {x, y} = y 
(point of type $A_0$). In a generic one-parameter family, one encounters for
special values of the parameter structures of the type $A_1$: $\{x, y\} = b(x^2 + y^2)$,
$b \ne 0$; in two-parameter families one finds $A_2$, etc.


Remark 1. In the two-dimensional case, the set of all Poisson structures 
forms a linear space, so that one may speak of a generic structure or family of 
structures (having in mind a structure [family] belonging to some open dense 
subset of the space of structures [families]). The problem of classifying generic 
Poisson structures in three or more dimensions is not uniquely posed, since 
the set of all such structures does not form a single manifold (one may find 
components of “different dimensions,” as in the classification of Lie algebras). 


Remark 2. The structure {x, y} = y of type $A_0$ is the standard Poisson
structure on the dual space of the Lie algebra of the group of affine transformations
of the line. This structure was considered in 1965, in connection with the
study of the Euler equations for left-invariant metrics on groups (in this
case, the Lobachevskii metric on a half-plane), at which time it was already
realized that the structure is stable and is locally equivalent to any structure
of the form $\{x, y\} = y + \cdots$, where the dots designate higher order terms. This
(evident) observation contradicts the previously mentioned conjecture of A. 
Weinstein, according to which the possibility of removing any higher order 
terms by a formal change of coordinates was characteristic of the linear 
Poisson structures on the dual spaces of semisimple Lie algebras. 


Remark 3. The parameters a, b in the table above are moduli (invariants 
depending continuously on the structure). More precisely, structures equivalent 
to a given one are found only a finite number of times as the parameters are 
varied. 

The rational functions in Table 1 may be replaced by polynomials, but it 
is not very convenient to do so. The number of moduli in the numerator is 
one less than the number of irreducible components of the curve f = 0. This is 
not merely a coincidence. One invariant of a Poisson structure on the plane 
is the residue constructed from the form $dx \wedge dy / f$ (initially, one constructs
a residue-form on each component, then its residue at the origin). The sum of 
the residues corresponding to all the components is zero. Therefore the 
number of moduli is 1 less than the number of components. 
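
As an illustration of the residue invariant (a sketch added for orientation; the sign convention for the residue is an assumption), take coordinates u, v in which the singular curve is the pair of lines uv = 0 and the structure is {u, v} = c uv with c ≠ 0. Over the complex numbers x² + y² = (x + iy)(x − iy), so type A_1 is of this kind up to a constant factor. The corresponding 2-form and its residues along the two components are:

```latex
\omega = \frac{du \wedge dv}{c\,uv}
       = \frac{du}{u} \wedge \frac{dv}{c\,v}
       = -\,\frac{dv}{v} \wedge \frac{du}{c\,u},
\qquad
\operatorname{Res}_{u=0}\,\omega = \frac{dv}{c\,v},
\quad
\operatorname{Res}_{v=0}\,\omega = -\,\frac{du}{c\,u},
\\[4pt]
\operatorname{res}_{0}\frac{dv}{c\,v}
  + \operatorname{res}_{0}\Bigl(-\frac{du}{c\,u}\Bigr)
  = \frac{1}{c} - \frac{1}{c} = 0 .
```

The two residues sum to zero, so only the single number 1/c survives as an invariant, in agreement with the count "number of components minus one" for a curve with two components.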


D Powers of volume forms 


The classification of Poisson structures on the plane may be considered as
the classification of differential forms of the type $f\,(dx \wedge dy)^{-1}$, where f is
a smooth (or holomorphic) function. More generally, it is natural to consider
forms of the type


$$f\,(dx)^{\alpha} = f(x_1, \ldots, x_n)\,(dx_1 \wedge \cdots \wedge dx_n)^{\alpha}, \tag{2}$$


where $\alpha$ is a fixed number, generally complex. The classification of such forms
and their deformations in the one-dimensional case, recently carried out by
V. P. Kostov, revealed the role of resonance values of $\alpha$ (certain negative
rational numbers).

For example, the resonance case $n = 1$, $\alpha = -1$ corresponds to the classification
of the singularities and their bifurcations for vector fields on the line,
i.e., singular points of differential equations $\dot{x} = v(x)$ and their bifurcations in
finite-parameter families. A generic one-parameter family may be reduced by
a smooth (holomorphic) change of the parameter and a smooth (holomorphic)
change of the variable x, depending smoothly (holomorphically) on the
parameter, to the form $\dot{x} = x^2 + \varepsilon + c(\varepsilon)x^3$. (For k parameters, the corresponding
form is $\dot{x} = x^{k+1} + \varepsilon_1 x^{k-1} + \cdots + \varepsilon_k + c(\varepsilon)x^{2k+1}$.)

The nonresonance case was studied by S. Lando for all n and $\alpha$: he showed
that almost every versal deformation of the function f defines, after multiplication
by $(dx)^{\alpha}$, a versal deformation of the form, as long as $\alpha$ is not a resonance
value.



The case $\alpha = -1$, which is interesting in connection with Poisson structures,
is generally a resonance case. Instead of powers of volume forms, as in (2),
we may consider the differential forms


$$f^{\beta}\,dx, \qquad \beta = 1/\alpha, \tag{3}$$


whose classification is obviously equivalent. 

The hypersurface f = 0 is invariantly connected with the form (3). The 
classification therefore begins with the reduction to normal form of the singu- 
larity manifold f = 0. The beginning of the hierarchy of singular points of 
hypersurfaces is known. In suitable local coordinates, a hypersurface is given 
by one of the equations in the following list: 


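$$\begin{aligned}
A_\mu &: \quad x_1^{\mu+1} + x_2^2 + \cdots + x_n^2 = 0, \quad \mu \ge 1;\\
D_\mu &: \quad x_1^2 x_2 + x_2^{\mu-1} + x_3^2 + \cdots + x_n^2 = 0, \quad \mu \ge 4;\\
E_6 &: \quad x_1^3 + x_2^4 + x_3^2 + \cdots + x_n^2 = 0;\\
E_7 &: \quad x_1^3 + x_1 x_2^3 + x_3^2 + \cdots + x_n^2 = 0;\\
E_8 &: \quad x_1^3 + x_2^5 + x_3^2 + \cdots + x_n^2 = 0.
\end{aligned}$$

After we have brought the hypersurface into normal form, the classification
of the forms (2) or (3) comes down to classifying forms of the type

$$f^{\beta} h(x_1, \ldots, x_n)\,dx, \qquad h(0) \ne 0, \tag{4}$$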


where f = 0 is the given equation of the singularity hypersurface and h is 
a smooth (holomorphic) function which remains to be put in normal form. 


E The quasi-homogeneous case 


We shall consider here the case in which the singularity hypersurface f = 0 is 
quasi-homogeneous (this condition holds for the cases A, D, E). 


Definition. A function f is called quasi-homogeneous of weight p, with weights
$w_i$ attached to the variables $x_i$, if it is an eigenfunction with eigenvalue p for
the quasi-homogeneous Euler vector field e (or is zero):


$$e f = p f, \qquad \text{where } e = \sum_i w_i x_i\,\frac{\partial}{\partial x_i}.$$


A quasi-homogeneous polynomial is called nondegenerate if the critical point
0 has finite multiplicity (i.e., it is $\mathbb{C}$-isolated). From here on, we will take the
weights $w_i$ to be positive numbers.


Theorem. Let f be a nondegenerate quasi-homogeneous polynomial of weight 1.
Then the differential form $f^{\beta} h\,dx$ (where $dx = dx_1 \wedge \cdots \wedge dx_n$ and h is a holomorphic
function on a neighborhood of 0) may be reduced by a biholomorphic
coordinate change in a neighborhood of zero to the form $f^{\beta}(1 + \varphi)\,dx$, where
$\varphi$ is a quasi-homogeneous polynomial of weight $-\beta - \sigma$, $\sigma = w_1 + \cdots + w_n$.

The weight of $\varphi$ is chosen so that the weight of the form $f^{\beta}\varphi\,dx$ is zero.
An analogous theorem is true for smooth h (and smooth coordinate changes),
except that in the real case one must replace $1 + \varphi$ by $\pm 1 + \varphi$.


EXAMPLE 1. If $\beta$ is positive, then $\varphi = 0$, so that the complex form reduces
to $f^{\beta}\,dx$.

More generally, $\varphi = 0$ if the (possibly complex) number $\beta$ is not a negative
rational number: in this case, a nonzero quasi-homogeneous polynomial of
weight $-\beta - \sigma$ does not appear. If the polynomial f (or just its quasi-homogeneity
type w) is fixed, then the resonance values of $\beta$ form a finite set
of arithmetic progressions in the negative rationals (for the remaining $\beta$,
$f^{\beta} h\,dx$ reduces to the form $f^{\beta}\,dx$).


EXAMPLE 2. If $\beta = -1$, then the monomials occurring in $\varphi$ may be enumerated
by the interior integral points of the Newton diagram of f. The monomial
$x^m = x_1^{m_1} \cdots x_n^{m_n}$ corresponds to the point $(m_1 + 1, \ldots, m_n + 1)$ of the diagram
(i.e., the exponent of the form $x^m\,dx$).


EXAMPLE 3. Suppose that $\beta = -1$, $n = 3$, and f is one of the A, D, E polynomials
introduced above, defining a simple singularity. Calculating weights, we find
that $-\beta - \sigma < 0$; therefore $\varphi = 0$, from which we obtain:


Corollary 1. The form with pole singularity


$$\frac{h(x, y, z)\,dx \wedge dy \wedge dz}{f(x, y, z)}, \qquad h(0) \ne 0,$$


where f is one of the polynomials A, D, E, may be reduced to the form
$dx \wedge dy \wedge dz / f$ by a holomorphic (smooth) change of coordinates.


In exactly the same way for any n > 3, a factor $h(x_1, \ldots, x_n)$ which does not
vanish at the origin can be converted to unity.
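
The weight calculation behind Example 3 is a short piece of rational arithmetic. The sketch below is an added illustration, not the author's computation: the helper names are hypothetical, and the weights are derived from the A, D, E equations listed above with n = 3, normalized so that f has weight 1.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def weights_3d(kind, mu=None):
    """Weights (w_x, w_y, w_z) making the simple singularity quasi-homogeneous of weight 1."""
    if kind == "A":        # x^(mu+1) + y^2 + z^2
        return (Fraction(1, mu + 1), HALF, HALF)
    if kind == "D":        # x^2 y + y^(mu-1) + z^2
        wy = Fraction(1, mu - 1)
        return ((1 - wy) / 2, wy, HALF)
    if kind == "E6":       # x^3 + y^4 + z^2
        return (Fraction(1, 3), Fraction(1, 4), HALF)
    if kind == "E7":       # x^3 + x y^3 + z^2
        return (Fraction(1, 3), Fraction(2, 9), HALF)
    if kind == "E8":       # x^3 + y^5 + z^2
        return (Fraction(1, 3), Fraction(1, 5), HALF)
    raise ValueError(kind)

def phi_weight(w, beta=-1):
    """Weight -beta - sigma that a nonzero polynomial phi would need to have."""
    return -beta - sum(w)

if __name__ == "__main__":
    cases = [("A", mu) for mu in range(1, 7)] + [("D", mu) for mu in range(4, 9)]
    cases += [("E6", None), ("E7", None), ("E8", None)]
    for kind, mu in cases:
        print(kind, mu, phi_weight(weights_3d(kind, mu)))
```

In every listed case the required weight is negative, while the weights of the variables are positive, so no nonzero polynomial of that weight exists.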


Corollary 2. A simple form (i.e., one not having moduli) of the type
$dx_1 \wedge \cdots \wedge dx_n / f(x_1, \ldots, x_n)$, where f is a holomorphic (smooth) function near the origin
and n > 2, may be reduced by a coordinate change in a neighborhood of the
origin to a normal form in which f is either 1 or one of the A, D, E polynomials.


Corollary 3. A simple (not having moduli) n-vector field in n-dimensional space
(n > 2) is locally equivalent to a normal form $f \cdot (\partial_1 \wedge \cdots \wedge \partial_n)$, where f is
either 1 or one of the A, D, E polynomials; $\partial_i = \partial/\partial x_i$.


Corollary 4. For $l \le 6$, in generic $l$-parameter families of n-vector fields on
n-dimensional space (n > 2), the field in a neighborhood of each point and for 
each value of the parameters is equivalent to one of the simple fields in the 
preceding corollary. 



Corollary 5. For $l \le 6$, in generic $l$-parameter families of forms
$dx \wedge dy \wedge dz / f(x, y, z)$, one finds only forms which in the neighborhood of each point are
locally equivalent to one of the following 24 types:


$$\frac{dx \wedge dy \wedge dz}{f}, \qquad f \text{ one of:}$$

$$1, \quad x, \quad x^2 + y^2 + z^2, \quad x^3 + y^2 + z^2,$$

$$x^4 + y^2 + z^2, \quad x^5 + y^2 + z^2, \quad x^2 y + y^3 + z^2, \quad x^6 + y^2 + z^2,$$

$$x^2 y + y^4 + z^2, \quad x^7 + y^2 + z^2, \quad x^2 y + y^5 + z^2, \quad x^3 + y^4 + z^2.$$


For $n = 2$ and $\beta = -1$, the theorem may be applied in the following way.


Corollary 6. Let f be a nondegenerate quasi-homogeneous polynomial of weight
1 with argument weights $w_1$, $w_2$. Then the form


$$\frac{h(x, y)\,dx \wedge dy}{f(x, y)}, \qquad h(0, 0) \ne 0,$$


where h is a smooth (holomorphic) function in a neighborhood of 0, can be
reduced by a suitable smooth (holomorphic) coordinate change to a form in
which $h = \pm 1 + \varphi$, where $\varphi$ is a quasi-homogeneous polynomial of weight
$1 - w_1 - w_2$.


Correspondingly, bivector fields and Poisson structures may be locally
reduced to the form


$$\frac{f(x, y)}{\pm 1 + \varphi(x, y)}\,(\partial_x \wedge \partial_y), \qquad \{x, y\} = \frac{f(x, y)}{\pm 1 + \varphi(x, y)}.$$


Calculating the weights of the simple singularity types A, D, E for functions
of two variables, we obtain Table 1 from the last corollary. For example, for
$A_1$ we have $w_1 = w_2 = 1/2$, the weight of $\varphi$ equals 0, and so $\varphi$ is constant.
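
The passage from Corollary 6 to the denominators of Table 1 amounts to listing the monomials of weight 1 − w_1 − w_2. The following sketch is an added illustration under that reading: the function name is hypothetical, and the weight pairs in the comments are computed from the two-variable A, D, E polynomials.

```python
from fractions import Fraction as F

def phi_monomials(w1, w2):
    """Exponent pairs (a, b) with a*w1 + b*w2 == 1 - w1 - w2 and a, b >= 0."""
    target = 1 - w1 - w2
    found = []
    a = 0
    while a * w1 <= target:
        b = (target - a * w1) / w2       # exact rational division
        if b.denominator == 1:           # keep only integer exponents
            found.append((a, int(b)))
        a += 1
    return found

if __name__ == "__main__":
    samples = {
        "A_1: x^2 + y^2":   (F(1, 2), F(1, 2)),
        "A_2: x^2 + y^3":   (F(1, 2), F(1, 3)),
        "A_3: x^2 + y^4":   (F(1, 2), F(1, 4)),
        "D_4: x^2 y + y^3": (F(1, 3), F(1, 3)),
        "D_5: x^2 y + y^4": (F(3, 8), F(1, 4)),
        "E_6: x^3 + y^4":   (F(1, 3), F(1, 4)),
        "E_7: x^3 + x y^3": (F(1, 3), F(2, 9)),
        "E_8: x^3 + y^5":   (F(1, 3), F(1, 5)),
    }
    for name, (w1, w2) in samples.items():
        print(name, phi_monomials(w1, w2))
```

An empty list means that the normal form carries no parameter; a pair (a, b) stands for a term with monomial x^a y^b in the polynomial φ.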

The dimension of the space of equivalence classes of forms $h\,dx \wedge dy / f$,
where h(0) # 0 and f is a fixed nondegenerate quasi-homogeneous polynomial, 
equals the dimension of the space of quasi-homogeneous polynomials of weight o. 


F Varchenko’s theorem 


A. N. Varchenko has proven a series of generalizations of the preceding 
theorem. Here we shall describe the simplest of these. 

1. Let f be a quasi-homogeneous polynomial of weight 1 in the variables
$x_1, \ldots, x_n$ with weights $w_1, \ldots, w_n$. Suppose that, for some set I of multi-indices,
the residue classes of the monomials $x^m$, $m \in I$, generate (as a vector space) the factor
algebra of the algebra of formal power series


$$\mathbb{C}[[x_1, \ldots, x_n]] \,/\, (\partial f/\partial x_1, \ldots, \partial f/\partial x_n).$$


Theorem. Every germ $f^{\beta} h\,dx$ is equivalent to a germ of the form
$f^{\beta}\bigl(1 + \sum \lambda_{m,l}\,x^m f^l\bigr)\,dx$, where the l’s are nonnegative integers and the m’s are elements
of I such that the weight of each form $f^{\beta} x^m f^l\,dx$ is equal to zero.


2. We define the degree of non-quasi-homogeneity of the germ f to be the
dimension of the factor space $(f, \partial f/\partial x_1, \ldots, \partial f/\partial x_n) / (\partial f/\partial x_1, \ldots, \partial f/\partial x_n)$.


Theorem. For almost all $\beta$, the number of moduli of the form $f^{\beta} h\,dx_1 \wedge \cdots \wedge dx_n$
(for fixed $\beta$ and f and arbitrary h, $h(0) \ne 0$) is equal to the degree of
non-quasi-homogeneity of the germ f. The exceptional (resonance) values of
$\beta$ consist of a finite number of arithmetic progressions of negative rational
numbers, with difference −1. In particular, for any $\beta > 0$, the number of
moduli equals the degree of non-quasi-homogeneity.


3. EXAMPLE. For $\beta = 0$, we obtain:


Corollary. The number of moduli of the form $h\,dx$ ($h(0) \ne 0$), relative to the
group of diffeomorphisms preserving the germ of f, equals the degree of 
non-quasi-homogeneity of f (equal to zero, if the germ of f is equivalent to a 
quasi-homogeneous one). 


4. In the resonance cases, the result is more complicated. 
EXAMPLE. Let $n = 2$, $\beta = -1$ (Poisson structures in the plane).


Theorem. The number of moduli for a germ of a Poisson structure with given 
singular curve f = 0 equals the degree of non-quasi-homogeneity of the germ 
of f augmented by one less than the number of irreducible components of the 
germ of the curve f = 0. 


In resonance cases, the number of moduli behaves in a rather regular 
way along each arithmetic progression with difference −1. Namely, when $\beta$
decreases by 1 the number of moduli increases (not necessarily strictly), but
its maximal value does not exceed (for any $\beta > -n$) the “nonresonant” value
(i.e., the degree of non-quasi-homogeneity of f) by more than the number
of Jordan blocks associated with the eigenvalue $e^{2\pi i \beta}$ of the monodromy
operator of the function f. 


G Poisson structures and period mappings 


An interesting source of Poisson structures is provided by the period mappings 
of critical points of holomorphic functions (A. N. Varchenko and A. B. 
Givental’, Mapping of periods and intersection form, Funct. Anal. Appl. 16
(1982), 83-93). 

Period mappings allow one to transfer to the base of a fibre bundle certain 
structures which live on the (co)homology spaces of the fibres. A Poisson
structure on the base arises in this way from the intersection form in the
middle-dimensional homology of the fibres, when this form is skew-symmetric. 

Period mappings are defined by the following construction. Suppose that 
one is given a locally trivial fibration. Associated to such a fibration are the 
bundles (over the same base) of homology and cohomology of the fibres with 
complex coefficients. These bundles are not only locally trivial, but they are 
locally trivialized in a canonical way (the integer cycles in a fibre are uniquely 
identifiable with integer cycles in the nearby homology fibres). A period 
mapping is defined as a section of the cohomology bundle. 

Suppose now that one is given, on the total space of a differentiable fibre 
bundle, a differential form which is closed on each fibre. The period mapping 
of this form associates to each point of the base the cohomology class of the 
form on the fibre over this point. 

If one is given a vector field on the base of the fibration, then any (smooth) 
period mapping may be differentiated along this vector field, and the derivative 
is again a period mapping. In fact, neighboring fibres of the cohomology 
bundle are identified with one another by the above-mentioned “integer” local 
trivialization, so a section may be considered (locally) as a map into one fibre 
and may be differentiated as an ordinary (vector-valued) function. 

Suppose now that the base is a complex manifold having the same complex 
dimension as the fibres of the cohomology bundle. A period mapping is called 
nondegenerate if its derivatives along any C-independent vectors at each 
point are linearly independent. In other words, a period mapping is non- 
degenerate if the corresponding local maps from the base to typical fibres are 
diffeomorphisms. 

The derivative of a nondegenerate period mapping thus allows us to map 
the tangent bundle of the base isomorphically onto the cohomology bundle. 
The dual isomorphism goes from the homology bundle to the cotangent 
bundle of the base. This isomorphism transfers to the base any additional 
structures carried by the homology groups. 

Suppose that the fibres of our original bundle are (real) oriented even 
dimensional manifolds, and consider their homology in the middle dimension. 
In this case, the homology of each fibre carries a bilinear form: the index of 
intersection. This form is symmetric if the dimension of the fibre is a multiple 
of 4; otherwise, it is skew-symmetric. The form is nondegenerate if the fibre is 
closed (i.e., compact and without boundary); otherwise, it may be degenerate. 
We shall suppose below that we are in the situation where the form is 
skew-symmetric. 

In this situation a nondegenerate period mapping induces a Poisson structure 
on the base. In fact, the isomorphism described above, between the cotangent 
spaces of the base and the homology groups of the fibres (carrying their 
skew-symmetric intersection forms), defines a skew-symmetric bilinear form 
on pairs of cotangent vectors. The Poisson bracket of two functions on the 
base is defined as the value of this form on the differentials of the functions. 

This bracket defines a Poisson structure (of constant rank) on the base.
This is obvious from the fact that the local identification of the base with
the cohomology of the typical fibre, given by the period mapping, provides the
base with local coordinates whose Poisson brackets are constant.¹²³


Figure 247. Poisson structure and the swallowtail

Varchenko and Givental’ observed that if one constructs, in the way just 
described, using a generic 1-form, a Poisson structure on the complement of 
the discriminant locus in the base of a versal deformation of a critical point 
of a function of two variables, then this structure may be holomorphically 
extended across the discriminant locus. (One may replace the discriminant 
locus above by the wave front of a typical singularity.) We shall limit ourselves 
here to the simplest examples of Poisson structures arising in this way. 

Consider the three-dimensional space of polynomials
$\mathbb{C}^3 = \{x^4 + \lambda_1 x^2 + \lambda_2 x + \lambda_3\}$ with coordinates $\lambda_i$. The polynomials with multiple roots form
therein the discriminant surface (a swallowtail; see Figure 247).

The Poisson structures arising from period mappings may be reduced 
(by diffeomorphisms preserving the swallowtail) to the following form: the 
symplectic leaves are the planes $\lambda_1 = \text{const.}$, and their symplectic structures
are of the form $d\lambda_2 \wedge d\lambda_3$.

The fibration of interest here is formed by the complex curves
$\{(x, y): y^2 = x^4 + \lambda_1 x^2 + \lambda_2 x + \lambda_3\}$, and the period mapping is given by, for example,
the form $y\,dx$. (See V. I. Arnold, A. N. Varchenko, S. M. Gusein-Zade,
“Singularities of Differentiable Mappings,” Vol. 2: Monodromy and the
Asymptotics of Integrals, Birkhäuser, 1988, §15, or Uspekhi Mat. Nauk 40,
no. 5 (1985).)


123 In the case where the intersection form is symmetric, the analogous construction defines on 
the base a flat pseudo-riemannian (possibly degenerate) metric.

The Poisson structures on the swallowtail space which arise from period
mappings may be characterized locally among all generic structures by the 
following property: the line of self-intersections of the tail lies entirely in one 
symplectic leaf. The required genericity condition is that the tangent planes at 
the origin to the symplectic leaf and the swallowtail do not coincide. Every 
smooth function which is constant along the line of self-intersections of the 
tail, and whose derivative along the symplectic leaf at the origin is nonzero, 
may be reduced in a neighborhood of the origin, by a diffeomorphism preserving
the tail, to the form $\lambda_1 + \text{const.}$; also, a family of holomorphic symplectic
structures in the planes $\lambda_1 = \text{const.}$ may be reduced to the form $d\lambda_2 \wedge d\lambda_3$
by a holomorphic local diffeomorphism of three-dimensional space which 
preserves the swallowtail as well as the foliation by the planes. 

One may conjecture more generally that those Poisson (in particular, 
symplectic) structures on the base of a versal deformation of a singularity, induced
from the intersection form by an infinitesimally stable period mapping,
may be characterized (up to diffeomorphisms preserving the bifurcation set) by
a natural condition on the rank of the restricted Poisson structure to the strata 
of the discriminant locus. The “natural condition” in the three-dimensional 
example above is that the line of self-intersections of the swallowtail be 
contained in a symplectic leaf. In four-dimensional space, an analogous role 
would apparently be played by the condition that a certain submanifold be 
lagrangian, namely, the manifold of polynomials having two critical points 
with critical value zero in the symplectic space of polynomials
$x^5 + \lambda_1 x^3 + \lambda_2 x^2 + \lambda_3 x + \lambda_4$ (the ranks of the symplectic structure on the tangent spaces
to the other strata may also be important).


A system of Jacobi’s elliptic coordinates is associated to each ellipsoid in
euclidean space. These coordinates make it possible to integrate the equations 
of geodesics on the given ellipsoid, as well as certain other equations, such as 
the equations of motion for a point on a sphere under the influence of a force 
with quadratic potential, or for a point on a paraboloid under the influence 
of a uniform gravitational field. 

These facts suggest that, even on an infinite-dimensional Hilbert space, 
there should be a class of integrable systems associated to each symmetric 
operator. To study these systems, it is necessary to extend the theory of elliptic 
coordinates to the infinite-dimensional case. To do this, it is first necessary to 
express the finite-dimensional theory of confocal quadric surfaces in coordinate-free
form.

In the transition to the infinite-dimensional case, symmetric operators on 
finite-dimensional euclidean spaces must be replaced by self-adjoint operators 
on Hilbert spaces. Since the elliptic coordinates are not really connected with 
the operator itself, but rather with its resolvent, the unboundedness of the 
original operator (which might be, for example, a differential operator) does 
not present a serious obstacle. 

In some cases, the elliptic coordinates on Hilbert space obtained from a 
self-adjoint operator form a countable sequence; however, when the operator 
has a continuous spectrum, the coordinates form a continuous family. In this 
case, the transformation from the original point of the Hilbert space (thought 
of as a function space) to the continuous family of elliptic coordinates of the 
point may be considered as a nonlinear mapping between function spaces. 
This mapping, by analogy with the Fourier transform, might be called the 
Jacobi transform: the original function is transformed into a function which 
expresses the elliptic coordinates in terms of some continuous “index.” (More 
precisely, the result of the transform is a measure on the spectral parameter 
axis.) The study of the functional analytic properties and the inversion of the 
Jacobi transform will probably be accomplished before too long. 

Following an exposition of the general theory of elliptic coordinates, we 
shall describe below some of the applications of these coordinates to potential 
theory. 

This appendix is based on the following papers by the author. 


Some remarks on elliptic coordinates, Notes of the LOMI Seminar (volume 
dedicated to L. D. Faddeev on his 50th birthday), 133 (1984), 38-50. 

Integrability of hamiltonian systems associated with quadrics (after J. 
Moser), Uspekhi 34, no. 5, 214. 

Some algebro-geometrical aspects of the Newton attraction theory, Pro- 
gress in Math. (I. R. Shafarevich volume), 36 (1983), 1—4. 

Magnetic analogues of the theorem of Newton and Ivory, Uspekhi 38, 
no. 5 (1983), 145-146. 


Further details on background material for the results in this appendix may 
be found in the following papers.


R. B. Melrose, Equivalence of glancing hypersurfaces, Invent. Math. 37
(1976), 165-191. 

J. Moser, Various aspects of integrable Hamiltonian systems, in: J. Guckenheimer
and S. E. Newhouse, eds. “Dynamical systems”, CIME Lectures,
Bressanone, Italy, June 1978, Cambridge, Mass., Birkhäuser, Boston, 1980,
pp. 233-289.

V.I. Arnold, Lagrangian manifolds with singularities, asymptotical of rays, 
and unfoldings of the swallowtail, Funct. Anal. Appl. 15 (1981). 

V.I. Arnold, Singularities in variational calculus, J. Soviet Mathematics 27 
(1984), 2679-2713. 

A. B. Givental’, Polynomial electrostatic potentials (Seminar report, in 
Russian), Uspekhi Mat. Nauk 39, no. 5 (1984), 253-254. 

V. I. Arnold, On the Newtonian potential of hyperbolic layers, Selecta 
Math. Sovietica 4 (1985), 103-106. 

A. D. Vainshtein and B. Z. Shapiro, Higher-dimensional analogs of the 
theorem of Newton and Ivory, Funct. Anal. Appl. 19 (1985), 17-20. 


A Elliptic coordinates and confocal quadrics 


Elliptic coordinates in euclidean space are defined with the aid of confocal 
quadrics (surfaces of degree two). The geometry of these quadrics is obtained 
from the geometry of pencils of quadratic forms in euclidean space (i.e., from 
the theory of principal axes of ellipsoids or from the theory of small oscillations) 
by a passage to the dual space. 


Definition 1. A euclidean pencil of quadrics (resp. quadratic forms) in a euclidean
vector space V is a one-parameter family of surfaces of degree two


$$\tfrac{1}{2}(A_\lambda x, x) = 1$$


(resp. forms $A_\lambda$), where


$$A_\lambda = A - \lambda E \qquad (E = \text{“identity”}),$$


and where A is a symmetric operator


$$A : V \to V^*, \qquad A^* = A.$$


Definition 2. A confocal family of quadrics in a euclidean space W is a family
of quadrics dual to the quadrics of a euclidean pencil in $W^*$:


$$\tfrac{1}{2}(A_\lambda^{-1}\xi, \xi) = 1.$$


Thus, quadrics which are confocal to one another form a one-parameter
family, but the quadratic forms defining the family do not depend linearly on
the parameter.


EXAMPLE. The family of plane curves which are confocal to a given ellipse
consists of all those ellipses and hyperbolas with the same foci. In Figure 248,
the curves of a confocal family are shown on the left, and the curves of the
corresponding euclidean pencil are shown on the right.


Figure 248. A confocal family and the corresponding euclidean pencil


The elliptic coordinates of a point are the values of the parameter $\lambda$ for which
the corresponding quadrics of a fixed confocal family pass through the point.
We fix an ellipsoid in euclidean space with all its axes of different lengths.


Theorem 1 (Jacobi). Through each point of an n-dimensional euclidean space 
there pass n quadrics confocal to a given ellipsoid. Smooth confocal quadrics 
intersect at right angles. 


Proof. Each point other than 0 in our space corresponds to an affine hyperplane
in the dual space, consisting of those linear functionals whose value is
1 at the given point. In terms of the dual space, Theorem 1 means that every 
hyperplane not passing through 0 in an n-dimensional euclidean space is 
tangent to precisely n of the quadrics in a euclidean pencil, and the vectors 
from 0 to the points of tangency are pairwise orthogonal (Figure 248, right). 

The proof of the property of euclidean pencils just stated is based on the
fact that the aforementioned vectors define the principal axes of the quadratic
form $B = \tfrac{1}{2}(Ax, x) - \tfrac{1}{2}(l, x)^2$, where $(l, x) = 1$ is the equation of the
hyperplane.

As a matter of fact, on a principal axis of any quadratic form B, corresponding
to the eigenvalue $\lambda$, the form $B - \lambda E$ reduces to 0 along with its
gradient. The vanishing of this form at the point of intersection of the
principal axis and the hyperplane means that the point of intersection lies on
the quadric $\tfrac{1}{2}(A_\lambda x, x) = 1$, while the vanishing of the gradient means that the
quadric and the hyperplane are tangent at the point. □
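
Theorem 1 can be checked numerically in the plane. The sketch below is an added illustration with assumed conventions: the confocal family is written as x²/(a − λ) + y²/(b − λ) = 1 with a > b > 0 (the factor 1/2 of Definition 2 is absorbed into the normalization), and the function names are hypothetical. Clearing denominators gives a quadratic equation in λ whose two roots are the elliptic coordinates of the point.

```python
import math

def elliptic_coordinates(x, y, a, b):
    """Roots lam1 <= lam2 of x^2/(a - lam) + y^2/(b - lam) = 1.

    Clearing denominators gives
        lam^2 - (a + b - x^2 - y^2) * lam + (a*b - b*x^2 - a*y^2) = 0.
    """
    s = a + b - x * x - y * y
    p = a * b - b * x * x - a * y * y
    root = math.sqrt(s * s - 4 * p)   # the discriminant is a sum of squares, hence >= 0
    return ((s - root) / 2, (s + root) / 2)

def gradient(x, y, a, b, lam):
    """Gradient of x^2/(a - lam) + y^2/(b - lam) at the point (x, y)."""
    return (2 * x / (a - lam), 2 * y / (b - lam))

def gradient_dot(x, y, a, b):
    """Scalar product of the normals of the two confocal conics through (x, y)."""
    lam1, lam2 = elliptic_coordinates(x, y, a, b)
    g1 = gradient(x, y, a, b, lam1)
    g2 = gradient(x, y, a, b, lam2)
    return g1[0] * g2[0] + g1[1] * g2[1]

if __name__ == "__main__":
    x, y, a, b = 1.0, 0.5, 4.0, 1.0   # arbitrary sample data
    print(elliptic_coordinates(x, y, a, b))
    print(gradient_dot(x, y, a, b))
```

For a point off the coordinate axes, one root lies below b (an ellipse) and the other between b and a (a hyperbola); the scalar product of the two normals vanishes up to rounding error, which is the orthogonality asserted in the theorem.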


Theorem 2 (Chasles). Given a family of confocal quadrics in n-dimensional 
euclidean space, a line in general position is tangent to n − 1 different quadrics
in the family, and the planes tangent to the quadrics at the points of tangency 
are pairwise orthogonal.


Proof. We project the quadrics in the confocal family along a pencil of parallel
lines onto the hyperplane perpendicular to the pencil. Each quadric defines an 
apparent contour (the set of critical values of the projection of the quadric). 
For a projection whose direction is in general position, the apparent contour 
is a quadric (i.e., a surface of degree two) in the image hyperplane. 


Here we need a lemma. 


Lemma. The apparent contours of the quadrics in a confocal family form 
themselves a confocal family of quadrics. 


Proof. On passage to the dual, sections become projections and vice versa. 
The apparent contours of the projections of confocal quadrics along a pencil 
of parallel lines are therefore dual to the sections of the dual quadrics by a 
hyperplane passing through the origin. 

The sections of the quadrics in a euclidean pencil by a hyperplane through 
0 form a euclidean pencil of quadrics in the hyperplane. The lemma now 
follows by duality. □ 


Returning to the proof of Theorem 2, we apply the lemma above to the 
projections along the line in the statement of the theorem. According to the 
lemma, the apparent contours of the projections of the confocal quadrics in 
Theorem 2 form a confocal family of quadrics in a hyperplane. By Theorem 1, 
n − 1 of these apparent contours pass through each point, where they intersect 
at right angles. This completes the proof of Theorem 2. □ 
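
A numerical illustration of Theorem 2, again assuming the coordinate form x₁²/(a₁ − λ) + ··· + xₙ²/(aₙ − λ) = 1 for the confocal family. The line p + tv touches the quadric with parameter λ exactly when the quadratic in t obtained by substitution has a double root. After clearing denominators, that condition becomes a polynomial of degree n − 1 in λ. The helper names and sample data are hypothetical.

```python
import numpy as np
from itertools import combinations

def tangent_confocal_parameters(a, p, v):
    """Parameters lam of the confocal quadrics tangent to the line p + t*v."""
    a, p, v = (np.asarray(u, dtype=float) for u in (a, p, v))
    n = len(a)

    def prod_except(skip):
        # prod over k not in skip of (a_k - lam), as a polynomial in lam
        q = np.poly1d([1.0])
        for k in range(n):
            if k not in skip:
                q = q * np.poly1d([-1.0, a[k]])
        return q

    # Discriminant condition, multiplied by prod_k (a_k - lam):
    #   sum_i v_i^2 prod_{k != i} d_k - sum_{i<j} L_ij^2 prod_{k != i,j} d_k = 0,
    # where d_k = a_k - lam and L_ij = p_i v_j - p_j v_i.
    poly = np.poly1d([0.0])
    for i in range(n):
        poly = poly + prod_except({i}) * float(v[i] ** 2)
    for i, j in combinations(range(n), 2):
        L = p[i] * v[j] - p[j] * v[i]
        poly = poly - prod_except({i, j}) * float(L ** 2)
    return np.sort(poly.roots.real)

def tangency_normal(a, p, v, lam):
    """Point of tangency and unit normal of the quadric lam along the line."""
    a, p, v = (np.asarray(u, dtype=float) for u in (a, p, v))
    d = a - lam
    t = -np.sum(p * v / d) / np.sum(v * v / d)   # double root of the quadratic
    x = p + t * v
    g = x / d
    return x, g / np.linalg.norm(g)

# Illustrative line in three-dimensional space.
a = [1.0, 2.0, 4.0]
p = [0.3, -0.2, 0.5]
v = [1.0, 0.4, -0.7]
lams = tangent_confocal_parameters(a, p, v)
```

With these sample values the line touches two quadrics of the family, and the normals returned by `tangency_normal` at the two points of tangency are orthogonal to each other and to the direction of the line.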


Theorem 3 (Jacobi and Chasles). Given a geodesic on a quadric Q in
n-dimensional space, there is a set of n − 2 quadrics confocal to Q such that all 
the tangent lines to the geodesic are also tangent to the quadrics in the set. 


Proof (Beginning). We consider the manifold of oriented lines in euclidean 
space. This manifold has a natural symplectic structure as the manifold of 
characteristics in the hypersurface p² = 1 in the phase space of a free particle 
moving under its own inertia in our euclidean space. 

(The characteristics on a hypersurface in a symplectic manifold are the 
integral curves of the field of characteristic directions, i.e., the field of directions 
which are skew-orthogonal to the tangent spaces of the hypersurface. In other 
words, the characteristics of the hypersurface are the phase curves for any 
hamiltonian flow whose hamiltonian function vanishes to first order on the 
hypersurface. 

The symplectic structure on the manifold of characteristics on a hyper- 
surface in a symplectic manifold is defined in such a way that the skew-scalar 
product of any two vectors tangent to the hypersurface is equal to the skew- 
scalar product of their projections in the manifold of characteristics. 

Note, finally, that the notion of characteristics is equally well defined for 
any submanifold of a symplectic manifold on which the induced 2-form has 
constant nullity. The characteristics then have dimension equal to that nullity, 
and the manifold of characteristics still inherits a symplectic structure.) □ 


Lemma A. Each characteristic of the manifold of lines tangent to a given 
hypersurface in euclidean space consists of all the lines tangent to a single 
geodesic on the hypersurface. 


PROOF OF LEMMA A. For efficiency of expression, we will identify the cotangent 
vectors to euclidean space with tangent vectors by using the euclidean structure, 
so that our original phase space is represented as the space of vectors based 
at points of euclidean space (i.e., momenta are identified with velocities). The 
unit vectors tangent to the given hypersurface form a submanifold of odd
codimension (equal to 3) in phase space. The characteristics of this submanifold
define the geodesic flow on the hypersurface. 

The map which assigns to each vector the line in which it lies takes the 
codimension 3 submanifold just described to the manifold of lines tangent 
to the hypersurface. Under this mapping, characteristics are transformed to 
characteristics (with respect to the symplectic structure on the space of lines). 
This proves the lemma. □ 


[Remark. The preceding argument may be easily extended to the following 
general situation, first considered by Melrose. Let Y and Z be a pair of
hypersurfaces in a symplectic manifold X which intersect transversally along a 
submanifold W. We consider the manifolds of characteristics B and C of the
hypersurfaces Y and Z together with the canonical quotient fibrations Y → B 
and Z → C; the manifolds B and C inherit symplectic structures from X. 

In the intersection W, there is a distinguished hypersurface (of codimension 
3 in X) consisting of points at which the restriction to W of the symplectic 
structure on X is degenerate. This hypersurface Σ in W may also be defined 
as the set of critical points of the composed mapping W ↪ Y → B (or 
W ↪ Z → C if one wishes). These objects form the following commutative 
diagram: 

    B ← Y ⊃ W ⊂ Z → C,    Σ ⊂ W,    Y, Z ⊂ X. 


The analogue to Lemma A in this situation is the assertion that the 
characteristics on the images of the mappings Σ → B and Σ → C are the images 
of one and the same curve on Σ (namely, the characteristics of Σ considered 
as a submanifold of the symplectic manifold X). 

Lemma A itself is the special case of the assertion above in which X = R²ⁿ 
(the phase space of a free particle in Rⁿ), the hypersurface Y consists of the unit 
vectors (given by the condition p² = 1, i.e., a level surface of the hamiltonian 
for a free particle), and the hypersurface Z consists of those vectors which are 
based at the points of the given hypersurface in Rⁿ. In this case, B is the 
manifold of all oriented lines in euclidean space, and Σ is the manifold of unit 
vectors tangent to the hypersurface. The mapping Σ → B assigns to each unit 
vector the line which contains it. The manifold C is the (co)tangent bundle of 
the given hypersurface. Σ → C is the embedding into this bundle of its unit 
sphere bundle (in other words, the embedding of a level surface of the kinetic 
energy, i.e., the hamiltonian for motion constrained to the hypersurface). 

It is always useful to keep the diagram above in mind when one is dealing 
with constraints in symplectic geometry. ] 


PROOF OF THEOREM 3 (Middle). We suppose given a smooth function on 
euclidean (configuration) space whose restriction to a certain line has a non- 
degenerate critical point. In this situation, the function will also have a critical 
point when restricted to each nearby line; i.e., on each nearby line, there will 
be a nearby point where the line is tangent to a level surface of the function. 
The value of the function at the critical point is thus a function (defined locally) 
on the space of lines. We call this function of lines the induced line function 
(from the original point function). □ 


Lemma B. If two point functions in euclidean space are such that the tangent 
planes to their level surfaces are orthogonal at the points where a given line 
is tangent to these surfaces (these points being in general different for the 
two functions), then the Poisson bracket of the induced line functions is zero 
at the given line (considered as a point in the space of lines). 


PROOF OF LEMMA B. We calculate the derivative of the second induced line 
function along the phase flow whose hamiltonian is the first induced function. 
The phase curves for the first induced function, which lie on its level surfaces, 
are the characteristics of those surfaces. A level surface for the first induced 
function consists of those lines which are tangent to a single level surface of 
the first point function. Each characteristic of this surface, according to 
Lemma A, consists of the lines which are tangent to a single geodesic on the 
level surface of the first point function. 

For an infinitesimally small displacement of a point on a geodesic in 
a surface, the tangent line to the geodesic rotates (up to infinitesimal quantities 
of higher order) in the plane spanned by the original tangent and the normal 
to the surface. By hypothesis, the tangent plane to the level surface of the 
second function at the point where this surface is tangent to our line is 
perpendicular to the tangent plane of the level surface of the first function. 
Therefore, under the above-mentioned infinitesimally small rotation, the line 
remains tangent to the same level surface of the second function (up to 
infinitesimals of higher order). It follows that the rate of change of the second 
induced function under the action of the phase flow given by the first is zero 
at the element in question of the space of lines, which proves Lemma B. □ 


PROOF OF THEOREM 3 (End). We fix a line in general position in Rⁿ. According 
to Theorem 2, this line is tangent to n − 1 quadrics in the confocal family, at 
n − 1 points. We construct in the neighborhood of each of these points a 
smooth function, without critical points, whose level surfaces are the quadrics 
of our confocal family. 

We fix one of these quadrics (the “first”) and consider the hamiltonian 
system on the space of lines whose hamiltonian function is the first induced 
line function. Each of its phase curves on a fixed level surface of the ham- 
iltonian function consists of the tangent lines to one geodesic of that quadric 
(Lemma A). The remaining induced functions have zero Poisson bracket with 
the hamiltonian, by Lemma B (since the planes tangent to the confocal 
surfaces at the points where they touch one line are orthogonal, by Theorem 2). 

Thus all the induced functions are first integrals for the hamiltonian system 
generated by any one of them. Since the lines tangent to a geodesic on the first 
quadric form a phase curve of the first system, all the induced functions take 
constant values on this curve. That proves Theorem 3, as well as the following 
result. □ 


Theorem 4. The geodesic flow on a central surface of degree 2 in euclidean space 
is a completely integrable system in the sense of Liouville (i.e., it has as many 
independent integrals in involution as it has degrees of freedom). 


Remark. Strictly speaking, we proved Theorem 3 only for lines in general 
position, but the result extends by continuity to the exceptional cases (in 
particular, to asymptotic lines of our quadrics). In the same way, Theorem 4 
was initially proved just for quadrics with unequal principal axes, but passage 
to a limit extends the result to more symmetric quadrics of revolution (as well 
as to noncentral “paraboloids”). 


B Magnetic analogues of the theorems of Newton and Ivory 


Elliptic coordinates make it possible to extend Newton’s well-known theorem 
on the gravitational attraction of a sphere to the case of attraction by an 
ellipsoid. 


Definition. A homeoidal density on the surface of an ellipsoid E is the density 
of a layer between E and an infinitely nearby ellipsoid which is homothetic 
to E (with the same center). 


The following is a well-known result. 


Ivory’s Theorem. A finite mass, distributed on the surface of an ellipsoid with 
homeoidal density, does not attract any internal point; it attracts every 
external point the same way as if the mass were distributed with homeoidal 
density on the surface of a smaller confocal ellipsoid. 


The attraction in Ivory’s theorem is defined by the law of Newton or 
Coulomb: in n-dimensional space, the force is proportional to r^(1−n) (as
prescribed by the fundamental solution of Laplace’s equation). 
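
Both parts of Ivory’s theorem can be tested numerically in the simplest case n = 2, where the force is proportional to 1/r. The sketch below rests on one computation that is not in the text: for the ellipse x = (a cos θ, b sin θ), the thickness of the homothetic layer times the arc length element equals ab dθ, so the homeoidal density is uniform in the parameter θ. The function name and sample points are illustrative.

```python
import numpy as np

def homeoid_force_2d(a, b, y, num=4000):
    """Force at y from unit mass spread with homeoidal density on an ellipse.

    Planar case (n = 2): the force of a point mass is (x - y) / |x - y|**2.
    The homeoidal mass element is uniform in the parameter theta, so the
    integral reduces to an equally weighted sum over theta.
    """
    theta = 2.0 * np.pi * np.arange(num) / num
    x = np.stack([a * np.cos(theta), b * np.sin(theta)], axis=1)
    r = x - np.asarray(y, dtype=float)
    r2 = np.sum(r * r, axis=1)
    return np.sum(r / r2[:, None], axis=0) / num

# Internal point of the ellipse with semi-axes 2 and 1.
force_inside = homeoid_force_2d(2.0, 1.0, (0.5, 0.3))

# External point, compared with a smaller confocal ellipse
# (same a**2 - b**2 = 3, so the foci coincide).
force_outside = homeoid_force_2d(2.0, 1.0, (3.0, 1.0))
force_confocal = homeoid_force_2d(np.sqrt(3.25), 0.5, (3.0, 1.0))
```

The force at the internal point vanishes up to rounding, while the two external forces agree with each other, as the theorem asserts.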

Newton’s theorem on the (non)attraction of an internal point carries over 
to the case of a hyperbolic homeoidal layer and to the case of an attracting 
mass distributed on a level hypersurface of a hyperbolic polynomial of any 
degree. (A polynomial f(x₁, ..., xₙ) of degree m is called hyperbolic if its 
restriction to any line through the origin has all its roots real.) 

A homeoidal charge density on the zero hypersurface f = 0 of a hyperbolic 
polynomial is defined as the density of a homogeneous infinitesimally thin 
layer between the hypersurfaces f = 0 and f = ε → 0 (the signs of the charges 
being chosen so that successive ovaloids have opposite charges). 

[A homeoidal charge does not attract the origin (nor any other point within 
the innermost ovaloid), and this property is preserved if the charge density is 
multiplied by any polynomial of degree at most m — 2. 

Generalization: If a homeoidal charge density is multiplied by any polynomial 
of degree m − 2 + r, then the potential inside the innermost ovaloid is a harmonic 
polynomial of degree r (A. B. Givental’, 1983).] 

When one attempts to find a version for hyperboloids of Ivory’s theorem 
on the attraction of confocal ellipsoids, it turns out that an essential role is 
played by the topology of the hyperboloids. When passing to hyperboloids 
of different signatures, one must consider, instead of homeoidal densities, 
harmonic forms of different degrees, and instead of the Newton or Coulomb 
potential, the corresponding generalized forms-potentials given by the
Biot–Savart law. 

In the simplest nontrivial case of a hyperboloid of one sheet in three- 
dimensional euclidean space, the result is as follows. 

The hyperboloid divides space into two parts: “internal” and “external,” 
the latter being nonsimply connected. We consider elliptic coordinate curves 
from the system whose level surfaces are the quadrics confocal to the given 
hyperboloid. 

The elliptic coordinate curves on our hyperboloid, which are obtained 
by intersecting with the confocal ellipsoids (closed lines of curvature on 
the hyperboloid), are called the parallels of the hyperboloid. The orthogonal 
curves, obtained by intersection with the two-sheeted hyperboloids, are called 
the meridians. 

Although the elliptic coordinate system has singularities (on each symmetry 
plane of the quadrics in the family), the hyperboloid is smoothly fibred by 
the parallels (diffeomorphic to the circle) and meridians (diffeomorphic to 
the line). 

The region inside the hyperboloidal tube is also smoothly fibred by
meridians (orthogonal to the ellipsoids in the confocal family), while the annular 
region outside the hyperboloid is smoothly fibred by parallels (orthogonal to 
the hyperboloids of two sheets). 


Figure 249 Magnetic fields generalizing the theorems of Newton and Ivory 


Theorem. A current with a suitable density, flowing along the meridians of a 
hyperboloid, produces a magnetic field which is zero inside the hyperboloidal 
tube, while the field in the annular exterior region is directed along the 
parallels. A current with a suitable density, flowing along the parallels of a 
hyperboloid, produces a magnetic field which is zero in the exterior annular 
region, while the field inside the hyperboloidal tube is directed along the 
meridians. (See Figure 249.) 


The current densities giving rise to such magnetic fields, which generalize 
the homeoidal charge densities on ellipsoids, may be described in the following 
way. There are associated to each family of confocal quadrics in three- 
dimensional euclidean space two “focal curves”: an ellipse and a hyperbola. 
(See Figure 250.) The focal ellipse is the boundary of the limiting ellipsoid of 
the family in which the shortest axis shrinks to zero; the focal hyperbola arises 
in a similar way from the hyperboloids of one or two sheets. 


Figure 250 Focal ellipse and focal hyperbola 


We define a homeoidal density on a focal ellipse in the following way. To 
begin we consider any nonplanar parallel, defined as the nonplanar inter- 
section of an ellipsoid with a hyperboloid of one sheet. A homeoidal density 
on this parallel is defined as the density on an infinitesimally thin “wire,” 
obtained by intersecting the layer between the given ellipsoid and a homothetic 
one infinitesimally nearby with the layer between the given hyperboloid and 
a homothetic one infinitesimally close by, both homotheties being taken with 
respect to the center of the confocal family. We normalize this homeoidal 
density on the parallel in such a way that the mass of the entire parallel is equal 
to 1. 

Now we consider the focal ellipse as a limit of nonplanar parallels. It turns 
out that the normalized homeoidal densities on the parallels have a well- 
defined limit as the parallels approach the focal ellipse. This limiting density 
is called the homeoidal density on the focal ellipse. 

The homeoidal density on a focal hyperbola is defined in an analogous way. 

We may now describe the current densities referred to as “suitable” in the 
theorem above on magnetic fields. The surface of a hyperboloid of one sheet 
is fibred over the focal ellipse (the fibre over a point is the meridian which lies 
on the same hyperboloid of two sheets as that point). 

The flux of the meridional current suitable for the theorem, through any curve 
on the hyperboloid, equals the integral of the homeoidal density form on the 
focal ellipse over the projection of that curve onto the focal ellipse (along the 
hyperboloids of two sheets). 

The density of the flow along the parallels is induced in an analogous way 
from the homeoidal density on the focal hyperbola. 


Remark. The magnetic field of the parallel flow with the indicated density, 
inside the hyperboloidal tube, coincides outside each confocal ellipsoid (up to 
sign) with the newtonian or coulombian field produced by a charge which is 
distributed with homeoidal density on that ellipsoid.¹²⁴ 

In exactly the same way, the magnetic field in the annular domain outside 
the hyperboloid of one sheet coincides (up to sign), in the region between the 
sheets of each confocal hyperboloid of two sheets, with the coulombian field 
produced by two equal charges with opposite signs distributed on the two 
sheets of the hyperboloid with homeoidal density (O. P. Shcherbak). 

The results formulated above have recently been extended by B. Z. Shapiro 
and A. D. Vainshtein to hyperboloids in euclidean spaces of any number of 
dimensions. For a hyperboloid in Rⁿ, diffeomorphic to Sᵏ × Rˡ, a harmonic 
k-form is constructed on the exterior region (diffeomorphic to the product of 
Sᵏ with a half-space) and a harmonic l-form is constructed on the interior. 

The corresponding homeoidal densities are defined on the focal ellipsoid 
with codimension k and the focal hyperboloid of two sheets with codimension 
l by the same limiting procedure that we described above for k = l = 1, 
using the intersections of layers between infinitesimally close and homothetic 
quadrics. 


¹²⁴ This is actually the density with which a charge will distribute itself on the surface of a 
conducting ellipsoid. 

Noncomputational proofs of these geometric theorems are unknown, even 
for the special case of magnetic fields in three-dimensional space. 


Remark. The presence of distinguished harmonic forms on hyperboloids 
and in their complementary domains suggests that one might try to find 
filtrations, analogous to those arising in the theory of mixed Hodge structures, 
in spaces of differential forms on noncompact (and possibly even singular) 
algebraic and semialgebraic real manifolds. 


Appendix 16: Singularities of ray systems 


The simplest example of a ray system is the system of normals to a surface in 
euclidean space. 

In a neighborhood of a smooth surface, its normals form a smooth fibration, 
but at some distance from the surface various normals begin to intersect one 
another (Figure 251). The complicated figures which are thereby formed were 
already investigated by Archimedes, but their full details were not revealed 
until the discovery in 1972 of the relation between singularities of ray systems 
and the theory of groups generated by reflections. 

This relation, for which there is no evident a priori reason (and which is as 
surprising as, say, the relation between the problems of tangents and areas), 
has turned out to be a powerful instrument for the study of critical points of 
functions. By 1978, it had become clear that the theory of reflection groups 
also governs the singularities of the Huygens evolvents. 

Huygens (1654) discovered that the evolvent of a plane curve has a cusp 
singularity at each point where it meets the curve (Figure 252). Evolvents of 
plane curves and their higher-dimensional generalizations are wave fronts 
on manifolds with boundary. Singularities of wave fronts, like those of ray 
systems, are classified in terms of reflection groups. 

While rays and fronts on manifolds without boundary are related to the 
Weyl groups in the A, D, and E series, singularities of evolvents are described 
by the groups of types B, C, and F (the ones with double connections in their 
Dynkin diagrams). 

The remaining reflection groups (I₂(p), H₃, H₄) continued for some time to 
have no visible relation to the theory of singularities. This situation changed 
in the fall of 1982 when it was discovered that the symmetry group H₃ of the 
icosahedron governs the singularities of evolvent systems in the neighborhood 
of inflection points of plane curves. 

The appearance of the icosahedron at an inflection point of a curve looks 
as mystical as the icosahedron in Kepler’s law of planetary distances. But the 
presence of the icosahedron here is not an accident: upon the investigation in 
1984 of more complicated systems of rays and fronts, the remaining group H₄ 
appeared. 

We shall give in this appendix a brief description of the theory of singularities 
of ray systems. Further details may be found in the following references: 


V. I. Arnold, Singularities of ray systems, Russian Math. Surveys 38 (1983). 

V. I. Arnold, Singularities in variational calculus, J. Soviet Math. 27 
(1984), 2679-2713. 

O. V. Lyashko, Classification of critical points of functions on a manifold 
with singular boundary, Funct. Anal. Appl. 17 (1983), 187-193. 

O. P. Shcherbak, Singularities of families of evolvents in the neighborhood 
of an inflection point of the curve, and the group H₃ generated by reflections, 
Funct. Anal. Appl. 17 (1983), 301-303. 

A. N. Varchenko and S. V. Chmutov, Finite irreducible groups, generated 
by reflections, are monodromy groups of suitable singularities, Funct. Anal. 
Appl. 18 (1984), 171-183. 

V. I. Arnold, Singularities of solutions of variational problems (Seminar 
report, in Russian), Uspekhi Mat. Nauk 39, no. 5 (1984), 256. 

O. P. Shcherbak, Wave fronts and reflection groups, Russian Math. Surveys 
43, no. 3 (1988). 

Itogi Nauki i Tekhniki, Sovremennye Problemy matematiki, Noveishie 
dostijenia, Moscow, VINITI, vol. 33 (1988). English translation: J. Sov. Math. 
27 (1984). 


Figure 251 A caustic as the envelope of rays 


Figure 252 An evolvent of a curve 


Many of the results which we will describe concern such simple geometric 
objects that it is surprising that they were not already known in classical times. 
For instance, the local classification of projections of generic surfaces in 
three-dimensional space was not discovered until 1981. The number of equi- 
valence classes of germs of projections turned out to be finite—namely 14: 
neighborhoods of points on generic surfaces can have that many different 
appearances when viewed from different points in space. 


A Symplectic manifolds and ray systems 


1. The space of oriented lines in euclidean space may be identified with 
the (co)tangent bundle of the sphere (Figure 253), and it thereby obtains a 
symplectic structure. 

Figure 253 The space of oriented lines in euclidean space 


2. More generally, we consider any hypersurface in a symplectic manifold. 
The skew-orthogonal complement to its tangent space at each point is called 
the characteristic direction. The integral curves of the field of characteristic 
directions on a hypersurface are called characteristics. The manifold of
characteristics inherits a symplectic structure from the original manifold. 

3. In particular, the manifold of extremals of a general variational problem 
carries a symplectic structure. 

4. We consider the space of binary forms (homogeneous polynomials in 
two variables) of a particular odd degree. The group of linear transformations 
of the plane acts on this even dimensional linear space. Up to multiplication 
by a constant, there is a unique nondegenerate skew-symmetric form on this 
space which is invariant under the action of the group SL(2) of linear trans- 
formations with determinant equal to 1. This form gives a natural symplectic 
structure on the manifold of binary forms of each odd degree. 

5. The binary forms in x and y for which the coefficient of x²ᵏ⁺¹ is unity 
form a hypersurface in the space of all forms. The manifold of characteristics of 
this hypersurface is naturally identified with the manifold of monic polynomials 
of even degree x²ᵏ + ··· in x. We have thereby defined a natural symplectic 
structure on this space of polynomials. 

6. The one-parameter group of translations along the x-axis preserves the 
symplectic structure just introduced. The hamiltonian function for this group 
is a quadratic polynomial (found already by Hilbert (1893)). The manifold 
of characteristics for any level surface of this hamiltonian function may be 
identified with the manifold of monic polynomials of degree 2k − 1 in x for which 
the sum of the roots is zero. Thus we have a natural symplectic structure on 
this space of polynomials. 
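
The identification in item 1 can be made concrete. In the sketch below (the function name is illustrative, not from the text), an oriented line is sent to its unit direction u, a point of the sphere, together with the foot q of the perpendicular dropped from the origin. Since q is orthogonal to u, the pair (u, q) is a tangent vector to the sphere, and the euclidean structure identifies it with a cotangent vector.

```python
import numpy as np

def line_to_tangent_vector(x, v):
    """Map the oriented line through x with direction v to a point of TS^(n-1)."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    u = v / np.linalg.norm(v)        # unit direction: a point of the sphere
    q = x - np.dot(x, u) * u         # point of the line nearest to the origin
    return u, q

# The same oriented line, described by two different points and
# two different positive multiples of the direction vector.
u1, q1 = line_to_tangent_vector([1.0, 2.0, 3.0], [0.0, 0.0, 2.0])
u2, q2 = line_to_tangent_vector([1.0, 2.0, -7.0], [0.0, 0.0, 5.0])
```

The pair (u, q) does not depend on which point of the line or which positive multiple of the direction is used, so the map is well defined on oriented lines.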


B Submanifolds of symplectic manifolds 


The restriction of a symplectic structure to a submanifold is a closed 2-form, 
but it is not necessarily nondegenerate. For submanifolds in euclidean space 
there is, in addition to the intrinsic geometry, an extensive theory of extrinsic 
curvatures. In symplectic geometry, the situation is simpler: 


Theorem (A. B. Givental’, 1981). The restriction of the symplectic form to a 
germ of a submanifold in a symplectic manifold determines the germ up to a 
symplectic diffeomorphism of the ambient manifold. 


An intermediate theorem, in which one uses the values of the symplectic 
form at all vectors based on the submanifold, not just those tangent to it, was 
proved earlier by A. Weinstein (1971). Unlike Weinstein’s theorem, Givental’s 
theorem makes it possible to classify generic submanifold germs in symplectic 
manifolds: it is sufficient to use the classification of degenerate symplectic 
structures obtained by J. Martinet (1970) and his successors. 


EXAMPLES. 1. A generic two-dimensional surface in symplectic space is
symplectically diffeomorphic in a neighborhood of each point with the surface 
p₂ = p₁², p₃ = q₃ = ··· = 0 (in Darboux coordinates). 2. On four-dimensional 
submanifolds, one finds stable curves of elliptic and hyperbolic Martinet 
singular points with normal forms 


p₂ = p₁p₃ ± q₁q₂ + q₁³/6, q₃ = 0, p₄ = q₄ = ··· = 0. 


[The ellipticity or hyperbolicity of a singular point is determined by the 
nature of the dynamical system invariantly attached to the submanifold. The 
divergence-free vector fields in three-dimensional space which arise have 
entire curves of singular points. The classification of singular lines turns out 
to be less pathological than the classification of singular points (which is 
almost as difficult as all of celestial mechanics). ] 

This concludes a description of the first steps in the theory of symplectic 
singularities on smooth manifolds. 


C Lagrangian submanifolds in the theory of ray systems 


We recall that a lagrangian submanifold is a submanifold of symplectic space 
on which the symplectic structure pulls back to zero and which has the highest 
possible dimension consistent with this property (equal to half the dimension 
of the ambient manifold). 


EXAMPLES. 1. Each fibre of a cotangent bundle is lagrangian. 2. The manifold of 
all oriented normals to a smooth submanifold (of any dimension) in euclidean 
space is a lagrangian submanifold of the space of lines. 3. The manifold of all 
polynomials x²ᵐ + ··· divisible by xᵐ is lagrangian. 


A lagrangian fibration is a fibration all of whose fibres are lagrangian. 


EXAMPLES. 1. The cotangent fibration is lagrangian. 2. The Gauss fibration 
from the space of lines in euclidean space to the unit sphere of directions is 
lagrangian. 

All lagrangian fibrations of a fixed dimension are locally (on a
neighborhood of a point in the total space) symplectically diffeomorphic. 

A lagrangian mapping is the projection of a lagrangian submanifold to the 
base of a lagrangian fibration, i.e., a triple V → E → B, where the first arrow 
is an immersion onto a lagrangian manifold and the second arrow is a 
lagrangian fibration. 


EXAMPLES. 1. A gradient mapping q ↦ ∂S/∂q is lagrangian. 2. The normal 
mapping which maps each normal vector of a submanifold in euclidean space 
to its tip is lagrangian. 3. The Gauss mapping which takes each point of a 
transversely oriented hypersurface in euclidean space to the unit vector at 
the origin in the direction of the normal is lagrangian. (The corresponding 
lagrangian manifold consists of the normals themselves.) 
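
For the first example, the lagrangian property can be verified in one line. The computation below assumes Darboux coordinates (p, q) with the symplectic form written as the sum of dp_i ∧ dq_i. The sign convention is an assumption and does not affect the conclusion.

```latex
\[
L=\Bigl\{(p,q) : p_i=\frac{\partial S}{\partial q_i}\Bigr\},
\qquad
\omega\big|_L
=\sum_i dp_i\wedge dq_i\,\Big|_L
=\sum_{i,j}\frac{\partial^2 S}{\partial q_i\,\partial q_j}\,dq_j\wedge dq_i
=0 .
\]
```

The last sum vanishes because the matrix of second derivatives is symmetric in (i, j) while dq_j ∧ dq_i is antisymmetric. Since L has half the dimension of the ambient space, it is lagrangian, and the gradient mapping is its projection along the lagrangian fibres p = const.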


An equivalence of lagrangian mappings is a fibre-preserving symplectic 
diffeomorphism of the total spaces of the fibrations which takes the first 
lagrangian manifold to the second. 

The set of critical values of a lagrangian mapping is called a caustic. The 
caustics of equivalent mappings are diffeomorphic. 


EXAMPLE. The caustic of the normal mapping of a surface is the envelope of 
the family of normals, i.e., the focal surface (surface of centers of curvature). 


Every lagrangian mapping is locally equivalent to a gradient (or normal, 
or Gauss) mapping. The singularities of generic gradient (or normal, or Gauss) 
mappings are the same as those for arbitrary generic lagrangian mappings. 
The simplest of these are classified by the reflection groups A_μ, D_μ, E₆, E₇, E₈ 
(see Appendix 12). 


EXAMPLE. We consider a medium of dust particles moving inertially, with 
their initial velocities forming a potential field. After time t, the particle at x 
moves to x + t(∂S/∂x). We thereby obtain a one-parameter family of smooth 
mappings R³ → R³. 

These mappings are lagrangian. In fact, a potential field of velocities gives 
a lagrangian section of the cotangent bundle. The phase flow of Newton’s 
equations preserves the lagrangian property. For large t, though, our lagrangian 
manifold is no longer a section: its projection on the base develops singular- 
ities. The caustics of the corresponding lagrangian mappings are places where 
the density of particles has become infinite.¹²⁵ According to Ya. B. Zel'dovich 
(1970), an analogous model (taking into account gravity and the expansion of 
the universe) describes the formation of large scale nonhomogeneities in the 
distribution of matter in the universe. 


¹²⁵ The relation between caustics and dust-like media was first discovered by Lifshitz, Sudakov, 
and Khalatnikov: see the survey by E. M. Lifshitz and I. M. Khalatnikov, Investigations in 
relativistic cosmology, Adv. Phys. 12 (1963), 185. 


Figure 255 Metamorphoses of caustics in 3-space 

According to the theory of Lagrange singularities, the newborn caustics 
have the form of elliptic saucers (Figure 254) (after time t from the moment of 
birth, a saucer has length of order t^(1/2), depth of order t, and thickness of order 
t^(3/2)). The birth of a saucer corresponds to A₃. The metamorphoses of caustics 
which occur in generic one-parameter families of lagrangian mappings are 
shown in Figure 255 (V. I. Arnold, Wave fronts evolution and equivariant 
Morse lemma, Comm. Pure Appl. Math. 6 (1976), 319-335). 


Theorem (1972). The germs at each point of generic lagrangian mappings 
between manifolds of dimension ≤5 are simple (i.e., having no moduli) and
stable. The simple stable germs of lagrangian mappings are classified by the 
reflection groups A, D, E, in a way which will be explained below. 


D Contact geometry and systems of rays and wave fronts 


We recall that a contact structure on an odd-dimensional smooth manifold 
is a nondegenerate field of tangent hyperplanes. The specific condition of 
nondegeneracy is inessential here, since near generic points, all generic hy- 
perplane fields on manifolds of a fixed odd dimension are diffeomorphic 
(Darboux’s theorem for contact structures, Appendix 4). 


EXAMPLES. 1. The manifold of contact elements of a smooth manifold consists 
of all its tangent hyperplanes. The rate of change of a contact element belongs 
to the contact structure if and only if the rate of change of the point of contact 
(i.e., the point where the hyperplane is tangent to the manifold) belongs to the
contact element itself. 2. The manifold of 1-jets of functions y = f(x) has a
contact structure dy = p dx (p = ∂f/∂x for the 1-jet of a function f).


The extrinsic geometry of a submanifold of contact space is locally deter- 
mined by the intrinsic geometry (Givental’s theorem on contact structures). 

Integral submanifolds of a contact structure are called Legendre (or 
legendrian) submanifolds if they have the largest possible dimension. 


EXAMPLES. 1. The set of all contact elements tangent to a fixed submanifold
(of any dimension) is a Legendre submanifold. 2. In particular, all contact 
elements at a given point form a Legendre submanifold (a fibre of the bundle 
of contact elements). 3. The set of all the 1-jets of a single function is a Legendre 
submanifold in the space of 1-jets. 
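
Example 3 can be checked by hand for functions of one variable. The sketch below is an illustration only; the helper names and the sample polynomial are assumptions, not part of the text. It lifts a polynomial f to the curve x -> (x, f(x), f'(x)) in the space of 1-jets and evaluates the contact form dy - p dx on the tangent vector of this curve.

```python
def poly_eval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def poly_deriv(coeffs):
    """Coefficients of the derivative, lowest degree first."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def contact_form(point, tangent):
    """Value of dy - p dx on the tangent vector (dx, dy, dp) at (x, y, p)."""
    x, y, p = point
    dx, dy, dp = tangent
    return dy - p * dx

def jet_lift(coeffs, x):
    """Point of the 1-jet curve of f over x, and the tangent vector there."""
    d1 = poly_deriv(coeffs)
    d2 = poly_deriv(d1)
    point = (x, poly_eval(coeffs, x), poly_eval(d1, x))
    tangent = (1, poly_eval(d1, x), poly_eval(d2, x))
    return point, tangent

if __name__ == "__main__":
    f = [0, -2, 0, 1]                 # f(x) = x**3 - 2*x (hypothetical sample)
    point, tangent = jet_lift(f, 2)
    print(point, tangent, contact_form(point, tangent))
```

The form vanishes on the lifted curve because dy = f'(x) dx and p = f'(x) there. A curve such as x -> (x, f(x), 0), which forgets the derivative, is in general not an integral curve.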


A fibration is called a Legendre fibration if its fibres are Legendre 
submanifolds. 


EXAMPLES. 1. The projective cotangent fibration (attaching each contact element
to its point of contact) is Legendre. 2. The fibration of 1-jets of functions
over the 0-jets (forgetting the derivative) is Legendre. 

All Legendre fibrations of a fixed dimension are locally contact diffeomorphic
(in a neighborhood of a point in the total space of the fibration).

The projection of a Legendre submanifold on the base of a Legendre 
fibration is called a Legendre mapping. The image of a Legendre mapping is 
called its front. 


EXAMPLES. 1. The Legendre transformation: A hypersurface in projective space 
may be lifted to the space of contact elements of projective space as a Legendre 
submanifold. The manifold of contact elements of projective space is also 
fibred over the dual projective space. (The fibration assigns to each contact 
element the plane containing it.) This is a Legendre fibration. The projection 
of the lifted Legendre submanifold maps it onto the hypersurface which is 
projectively dual to the original one. Thus, the projective dual of a smooth 
hypersurface is the front of a Legendre mapping. 2. Frontal mappings: Laying 
out a segment of length t on each normal to a hypersurface in euclidean space, 
we obtain a Legendre mapping whose front is equidistant from the given 
hypersurface. 
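
A frontal mapping can be followed numerically for a plane curve. The sketch below is a minimal illustration under assumptions not made in the text: the curve is the parabola y = x^2/2, the normal is taken on its concave side, and the function names are hypothetical. The derivative of the equidistant with respect to x equals (1 - t/(1 + x^2)^{3/2}) (1, x), so the front is singular exactly where t equals the radius of curvature.

```python
import math

# Assumed curve (not from the text): the parabola y = x**2 / 2 in the plane.

def front_point(x, t):
    """Point at distance t along the unit normal from (x, x**2/2)."""
    norm = math.sqrt(1.0 + x * x)
    return (x - t * x / norm, 0.5 * x * x + t / norm)

def speed_factor(x, t):
    """Factor s in d(front_point)/dx = s * (1, x); zero where the front is singular."""
    return 1.0 - t / (1.0 + x * x) ** 1.5

def singular_parameters(t):
    """Parameters x at which the equidistant at distance t is singular."""
    if t < 1.0:
        return []                      # the front is still smooth
    r = math.sqrt(max(t ** (2.0 / 3.0) - 1.0, 0.0))
    return [0.0] if r == 0.0 else [-r, r]

if __name__ == "__main__":
    for t in (0.5, 1.0, 8.0):
        print(t, [front_point(x, t) for x in singular_parameters(t)])
```

For t < 1 the equidistant is smooth; at t = 1 a singular point is born at the center of curvature of the vertex, and for t > 1 it splits into two cusps.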


Every Legendre mapping is locally equivalent to a Legendre transforma- 
tion, as well as to a frontal mapping. The theory of Legendre singularities thus 
coincides exactly with the theory of singularities of Legendre transformations 
and of frontal mappings. Equivalence, stability, and simplicity of Legendre 
mappings are defined just as in the lagrangian case.


Theorem (1973). The germs, at all points, of generic Legendre mappings between 
manifolds of dimension ≤5 are simple and stable. The simple and stable
germs of Legendre mappings are classified by the groups A, D, E: their fronts 
are locally diffeomorphic (in the complex domain) to the manifolds of non- 
regular orbits of the corresponding reflection groups. 


EXAMPLE. The only singularities of a typical wave front in three-dimensional
space are (semicubic) cuspidal curves (A_2) and “swallowtails” (A_3, Figure 256;
near such a point, the front is diffeomorphic to the surface formed by the
polynomials with multiple roots in the space of polynomials x^4 + ax^2 + bx + c).


Figure 256 Singularities of wave fronts 

Of course, there may also be transverse intersections of branches of fronts of
the types just described.
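
The swallowtail surface of polynomials x^4 + ax^2 + bx + c with multiple roots can be parametrized by the double root u and one further parameter v, writing the polynomial as (x - u)^2 (x^2 + 2ux + v). The sketch below is an illustration only (the parametrization and the helper names are assumptions chosen for convenience). It exhibits a point of the cuspidal edge (triple root) and a point of the self-intersection line (two double roots, reached from two different parameter values).

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_deriv(p):
    """Coefficients of the derivative, lowest degree first."""
    return [k * c for k, c in enumerate(p)][1:]

def poly_eval(p, x):
    """Evaluate by Horner's rule."""
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result

def swallowtail_point(u, v):
    """Coefficients (a, b, c) of (x - u)**2 * (x**2 + 2*u*x + v)."""
    p = poly_mul(poly_mul([-u, 1], [-u, 1]), [v, 2 * u, 1])
    # p = [c, b, a, 0, 1]: the x**3 coefficient vanishes by construction
    return (p[2], p[1], p[0])

def root_multiplicity(coeffs, x):
    """Multiplicity of x as a root of the polynomial with these coefficients."""
    m = 0
    p = list(coeffs)
    while p and poly_eval(p, x) == 0:
        m += 1
        p = poly_deriv(p)
    return m

if __name__ == "__main__":
    print("cuspidal edge:", swallowtail_point(1, -3))
    print("self-intersection:", swallowtail_point(1, 1), swallowtail_point(-1, 1))
```

The cuspidal edge corresponds to v = -3u^2, where the quadratic factor also vanishes at u.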


Remark. The real forms of simple singularities of fronts may also be described 
in terms of reflection groups. E. Looijenga has shown that the real components 
in the complement of a simple germ of a front may be identified with the 
conjugacy classes of involutions (elements of order 2) in the normalizer of 
the reflection group, conjugacy being taken with respect to the reflection 
group itself. (See E. Looijenga, The discriminant of a real simple singularity, 
Compositio Math. 37 (1978), 51-62.) 


E Applications of contact geometry to symplectic geometry 


All lagrangian singularities may be obtained from Legendre singularities, if 
one realizes the latter by projections of Legendre submanifolds of the space 
of 1-jets of functions onto the space of 0-jets. If one forgets the value of each 
function, the space of 1-jets is projected onto phase space (i.e., the cotangent 
bundle); a Legendre submanifold in the first space projects to a lagrangian 
submanifold in the second. In particular, the caustic of a lagrangian mapping 
is the image of the cuspidal edge of the front of a Legendre mapping under a 
projection with one-dimensional fibres. 


Theorem (O. V. Lyashko, 1979). All holomorphic vector fields transverse to the 
front of a simple singularity are locally equivalent under holomorphic diffeo- 
morphisms preserving the front. 


EXAMPLE. A generic vector field in the neighborhood of the most singular
point of a swallowtail {x^4 + ax^2 + bx + c = (x + d)^2 ...} is equivalent, by
a holomorphic diffeomorphism preserving the swallowtail, to the normal form
∂/∂c (Figure 257).


The reduction of various objects to normal form, by a diffeomorphism 
preserving a wave front or caustic, is a basic technique for studying the 
geometry of systems of rays and fronts. For instance, the study of the
metamorphoses of moving wave fronts is based on the following result, which is
“dual” to the previous one.


Figure 257 The normal form of a vector field at the swallowtail

Figure 258 Perestroikas of wave fronts


Theorem (1976). All generic holomorphic functions equal to zero at the most 
singular point of a simple singularity of a front are locally equivalent under 
holomorphic diffeomorphisms which preserve the front. 


EXAMPLE. In a neighborhood of the most singular point of a swallowtail, a 
generic function may be reduced, by a diffeomorphism preserving the swallow- 
tail, to the normal form a. 


This theorem is a special case of the equivariant Morse lemma. It is applied 
in the following way. The instantaneous wave fronts together form a “large 
front” in space-time. “Time” is a function on space-time. We reduce this 
function to normal form by a diffeomorphism which preserves the front, and 
we thereby obtain a normal form for the metamorphoses of the instantaneous 
fronts. The metamorphoses of fronts in R^3 are shown in Figure 258. The
problem of describing the metamorphoses of caustics in generic one-parameter 
families (Figure 255) is solved in exactly the same way. In this case, the time 
function is reduced to normal form by a transformation of space-time which 
preserves the “large caustic.” If the dimension of space-time is no larger than 
4, then all the singularities of the large caustic are of types A and D. 

The caustics of lagrangian singularities in the A series differ from the wave 
fronts in the A series only by a shift of 1 unit in the index. The same is therefore 
true for their metamorphoses. 

The caustics in the D series are not the same as the fronts. The normal forms 
for a generic time function in the neighborhood of a caustic singularity of type 
D were found by V. M. Zakalyukin (1975). The topological normal forms for 
the time function are especially simple:


Caustic Real case Complex case 
D, A, tA, Ay tA, 
Dj Ay tz Ag Ay + Ag Ay +A, 
Daya +A, Ay 
Dy, k = 3 A, tA, A, +A, 


Here, the large caustic D_μ is the set of λ for which F(·, λ) has a degenerate
critical point, where


A 
xp m = ae ttt Ay-2X1 + 2A,X2 (u 2 4). 


1 
F(x, A) = +x3x, + a1 

The reduction to normal form of the germ of the time function is accomplished
by a local homeomorphism of the space R^{μ−1} (C^{μ−1}), which preserves
the large caustic and which is smooth everywhere except at 0 (V. I. Bakhtin, 
1984). 

J. Nye (1984) has noticed that not all metamorphoses of caustics and fronts 
may be realized by the motion of a front under an equation of eikonal (or 
Hamilton-Jacobi) type. For example, the caustic of a ray system cannot have 
the form of “lips” with two cusps (although this is possible for lagrangian 
caustics). The point is that the inclusion of a lagrangian or Legendre manifold 
in the hypersurface given by a Hamilton-Jacobi or eikonal equation imposes 
topological restrictions on the coexistence, and thus on the metamorphoses, 
of singularities, even though the individual singularities may be realized on 
hypersurfaces. This is precisely the case when the level surface of the hamiltonian
is locally nondegenerately convex in the momentum variables.

The vector fields generating the diffeomorphisms preserving a front are 
those which are tangent to it. The study of these vector fields leads to an 
unusual “convolution” operation on the invariants of a reflection group. 
To a pair of invariants (functions on the orbit space) we associate a new 
invariant—the scalar product of the gradients of the functions (pulled back 
from the orbit space to the original euclidean space). 

The linearization of this operation defines a symmetric bilinear mapping 
from each cotangent space of the orbit space into itself. 


Theorem (1979). The linearized convolution of invariants of a reflection group 
is isomorphic as a bilinear operation to the operation on the local algebra of 
the corresponding singularity given by the formula (p, q) ↦ S(p·q), where
S = D + (2/h)E, D is Euler’s quasi-homogeneous derivation, and h is the
Coxeter number. 


In 1981, A. N. Varchenko and A. B. Givental’ (who also proved the theorem 
above for the exceptional groups) found a far-reaching generalization of this 
result. They replaced the euclidean structure by the intersection form of
the underlying period mapping, which arises from a family of holomorphic 
differential forms on the fibres of the Milnor fibration of a versal family of 
functions. A nondegenerate intersection form defines (depending on the parity 
of the number of variables) either a locally flat pseudo-euclidean metric with 
a standard singularity on the Legendre front or a symplectic structure which 
extends holomorphically to the front. 


EXAMPLE. The space of monic polynomials with odd degree and sum of the 
roots equal to zero acquires yet another symplectic structure. Relative to this 
structure, the submanifold of polynomials with the maximal number of double 
roots turns out to be lagrangian. 


When the intersection form is indefinite, the symplectic structure is replaced 
by a Poisson structure (see Appendix 14). 


F Tangential singularities 


The first applications of the theory of lagrangian and Legendre singularities, 
around which the theory itself developed (~ 1966), concerned short wave 
asymptotics in the form of the asymptotics of oscillatory integrals. A survey 
of these applications (including the determination of uniform estimates for 
oscillatory integrals when saddle points meet, the calculation of asymptotics 
using Newton polyhedra, the construction of mixed Hodge structures, appli- 
cations to number theory and the theory of convex polyhedra, and estimates 
of the index of singular points of vector fields and the number of singular 
points of algebraic surfaces) may be found in the book: 


V. I. Arnold, A. N. Varchenko, and S. M. Gusein-Zade, “Singularities of 
Differentiable Mappings,” Vol. 2, Monodromy and Asymptotics of Integrals, 
Moscow, Nauka, 1984. English translation: Birkhäuser, 1988.


and in the paper 


V. I. Arnold, Singularities of ray systems, Proceedings of the International 
Congress of Mathematicians, August 16-24, 1983, Warsaw. 


Here we shall present other applications of the theory of lagrangian and 
Legendre singularities to the study of the configurations of projective mani- 
folds and tangential planes of various dimensions. One is led to such problems 
from variational problems with one-sided constraints (such as the obstacle 
problem), as well as from the study of Nekhoroshev’s exponent of roughness 
for unperturbed hamiltonian functions (see Appendix 8). 

We consider a generic surface in three-dimensional projective space (Figure 
259). The curve of parabolic points (p) divides the surface into a domain of 
elliptic points (e) and a domain of hyperbolic points (h); the latter domain 
contains the curve of inflection points of the asymptotic lines (f), with its
points of biinflection (b), self-intersection (c), and tangency to the parabolic
curve (t).


Figure 259 Projective classification of points of a surface

From this classification of points, one may derive both estimates of curva- 
ture exponents and the following classification of projections. 


Theorem (O. A. Platonova and O. P. Shcherbak, 1981). Every projection from
a point outside a generic surface in RP^3 is locally equivalent at each point of
the surface to the projection along lines parallel to the x-axis of a surface
z = f(x, y), where f is one of the following 14 functions:


x, x^2, x^3 + xy, x^3 ± xy^2, x^3 + xy^3, x^4 + xy,


x^4 + x^2 y + xy^2, x^5 ± x^3 y + xy, x^3 ± xy^4, x^4 + x^2 y + xy^3, x^5 + xy.


By a projection we mean here a diagram V → E → B consisting of an
embedding and a fibration; an equivalence of projections is then a 3 × 2
commutative diagram whose vertical arrows are diffeomorphisms. 

The only singularities of the projection from a generic center are folds and 
Whitney tucks. The tucks appear when the projection is along an asymptotic 
direction. The remaining singularities are visible only from special points. The 
finiteness of the number of singularities of projections (and therefore the number 
of singularities of apparent contours) was not obvious before the result above 
was obtained, since there is a continuum of inequivalent singularities for 
generic three-parameter families of mappings from a surface to the plane. 

The regions of space from which the generic surface has a different appear- 
ance, as well as the corresponding views of the surface, are shown in Figure 260 
(for the most complicated cases). 

The hierarchy of tangential singularities becomes more comprehensible 
when it is reformulated in terms of symplectic and contact geometry. R. 
Melrose (1976) observed that the rays tangent to a surface are described by 
a pair of hypersurfaces in symplectic phase space: one of them, p^2 = 1, is
defined by the metric; the other is defined by the surface. 

A significant part of the geometry of asymptotic lines may be reformulated 
in terms of this pair of hypersurfaces. In this way, we may transfer concepts 
from the geometry of surfaces to the more general case of arbitrary pairs of 
hypersurfaces in symplectic space, and thereby use the geometric intuition
gained from surface theory to study general variational problems with
one-sided phase constraints.


Figure 260 The perestroikas of the visible contours of surfaces

Let Y and Z be hypersurfaces in the symplectic space X which intersect 
transversely along a submanifold W. Projecting Y and Z onto their manifolds 
of characteristics, we obtain the hexagonal diagram 


in which Σ is the common manifold of critical points for the projections of W
on U and V. 


EXAMPLE. Let X be the {q, p} phase space for a free particle in euclidean space
(q is the position of the particle, p its momentum). Y is the manifold of unit
vectors (p^2 = 1). Z is the manifold of vectors at the boundary (q belongs to
a hypersurface Γ). Then U is the manifold of rays, V is the tangent bundle of
the boundary Γ, W is the manifold of unit vectors at the boundary, and Σ is
the unit tangent bundle of the boundary.


If a unit tangent vector to the boundary is not asymptotic, then both of the 
projections W → U and W → V have fold singularities at this point. Each of
them defines an involution on W which fixes Σ.


EXAMPLE. There are two involutions, σ and τ, on the manifold of tangent
vectors along a convex plane curve W (Figure 261). Their product is Birkhoff’s 
billiard mapping (1927). 
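
For the unit circle the two involutions can be written down explicitly. The sketch below is an illustration under assumptions not fixed by the text: the curve is the unit circle, and the names `sigma` (reflection of the vector in the tangent line) and `tau` (transport of the base point along the chord) are an arbitrary labeling of the two involutions. Exact rational arithmetic keeps the checks free of rounding error.

```python
from fractions import Fraction

def dot(a, b):
    """Euclidean scalar product in the plane."""
    return a[0] * b[0] + a[1] * b[1]

def sigma(q, p):
    """Reflect the vector p in the tangent line at q (unit circle: normal = q)."""
    k = 2 * dot(p, q)
    return q, (p[0] - k * q[0], p[1] - k * q[1])

def tau(q, p):
    """Move the base point along the line through q with direction p to the
    other intersection with the unit circle; the vector p is unchanged."""
    s = -2 * dot(q, p) / dot(p, p)
    return (q[0] + s * p[0], q[1] + s * p[1]), p

def billiard(q, p):
    """One bounce: travel along the chord (tau), then reflect (sigma)."""
    return sigma(*tau(q, p))

if __name__ == "__main__":
    q = (Fraction(1), Fraction(0))
    p = (Fraction(-3, 5), Fraction(4, 5))   # unit vector pointing into the disc
    print(billiard(q, p))
```

Both maps fix exactly the vectors tangent to the circle (dot(p, q) = 0). For the circle the product preserves dot(p, q), that is, the angle of incidence.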


Using pairs of involutions, Melrose found a local normal form for pairs of 
hypersurfaces in symplectic space which are in the situation just described. 
(This was for the C^∞ case; in the analytic case, one usually obtains divergent
series, just as in the theory of Ecalle (1975) and Voronin (1981) on resonant
dynamical systems.)


Figure 261 The two involutions generating the billiard mapping

For more complicated singularities (for example, near asymptotic direc- 
tions), pairs of hypersurfaces have moduli. For the two simplest singularity 
types after the fold, it is possible to put in normal form (at least for- 
mally) the pair consisting of the first hypersurface and its intersection with 
the second. This allows us to study, in a neighborhood of an asymptotic or 
biasymptotic unit tangent vector to the boundary, the mapping which assigns 
the ray containing it to each unit vector at the boundary. The critical values 
of this mapping in the symplectic space of lines are described by the following 
result, since the manifold of tangent rays is locally diffeomorphic near a 
biasymptotic ray to the product of a swallowtail and a line. 


Theorem (1981). All the generic symplectic structures in the neighborhood of a 
point in the direct product of a swallowtail and a linear space are formally 
diffeomorphic by local diffeomorphisms preserving the product structure. 


G The obstacle problem 


We consider an obstacle bounded by a smooth surface in euclidean space. 
The obstacle problem consists of the study of the singularities of the function 
defined outside the obstacle whose value at each point is the length of the 
shortest path remaining outside the obstacle and joining the point to a fixed 
initial set. This variational problem on a manifold with boundary is unsolved 
even in three-dimensional space. 

Each minimizing path consists of segments of straight lines and segments 
of geodesics on the surface of the obstacle (Figure 262). We consider therefore 
a system of geodesics on the surface of the obstacle, orthogonal to a fixed front. 
The system of all rays tangent to these geodesics forms a lagrangian variety 
in the symplectic space of lines, just as any system of extremals for a varia- 
tional problem. But while in an ordinary variational problem this lagrangian 
variety is a smooth manifold (even at caustics), the lagrangian variety arising 
in the obstacle problem has singularities. From the last theorem (in the 
previous section), one obtains: 


Figure 262 An extremal of the obstacle problem

Figure 263 The open (“unfurled”) swallowtail


Corollary (1981). The lagrangian variety of rays in a generic obstacle problem 
has a semicubic cuspidal edge along each asymptotic ray and a singularity 
diffeomorphic to an open swallowtail at each biasymptotic ray. 


The open swallowtail is the surface in the four-dimensional space of monic 
polynomials x^5 + Ax^3 + Bx^2 + Cx + D formed by the polynomials with
triple roots. Differentiation of the polynomials turns the open swallowtail into 
an ordinary one; when the swallowtail is opened, the cuspidal edge is retained, 
but the self-intersection disappears (Figure 263). 
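
The relation between the open and the ordinary swallowtail can be verified on exact examples. The sketch below is an illustration only; the parametrization (x - u)^3 (x^2 + 3ux + w) of quintics with a triple root and the helper names are assumptions made for convenience. Differentiating and dividing by 5 sends a point of the open swallowtail to a quartic x^4 + ax^2 + bx + c with a double root. Two distinct points of the open swallowtail are found that project to the same point of the self-intersection line of the ordinary swallowtail.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def open_swallowtail_point(u, w):
    """Coefficients (A, B, C, D) of (x - u)**3 * (x**2 + 3*u*x + w)."""
    cube = poly_mul(poly_mul([-u, 1], [-u, 1]), [-u, 1])
    p = poly_mul(cube, [w, 3 * u, 1])
    # p = [D, C, B, A, 0, 1]: the x**4 coefficient vanishes by construction
    return (p[3], p[2], p[1], p[0])

def project_to_swallowtail(point):
    """Differentiate and divide by 5: (a, b, c) of x**4 + a*x**2 + b*x + c."""
    A, B, C, D = point
    return (Fraction(3) * A / 5, Fraction(2) * B / 5, Fraction(C) / 5)

if __name__ == "__main__":
    w = Fraction(8, 3)
    for u in (1, -1):
        point = open_swallowtail_point(u, w)
        print(point, "->", project_to_swallowtail(point))
```

The two quintics differ only in the constant term D, which differentiation forgets; this is how the self-intersection is resolved upstairs.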


Theorem (1981). In the generic motion of a wave front, the cuspidal edges of 
the instantaneous fronts sweep out an open swallowtail in four-dimensional 
space-time (over the usual swallowtail caustic). 


Theorem (O. P. Shcherbak, 1982). Consider a generic one-parameter family of 
space curves and suppose that, for some value of the parameter (time), one of 
the curves has a point of double flatness (of type 1, 2, 5). Then the projective 
duals of these curves form a surface in space-time which is locally diffeo- 
morphic to the open swallowtail. 


The open swallowtail is the first member of a whole series of singularities. 
Consider, in the space of monic polynomials x^n + λ_1 x^{n−2} + ··· + λ_{n−1}, the set
of polynomials with a root of fixed comultiplicity k, (x − a)^{n−k}(x^k + ···).
Differentiation of polynomials preserves the comultiplicity of roots. 


Theorem (A. B. Givental’, 1981). The sequence of sets of polynomials of fixed 
comultiplicity becomes stabilized as the degree grows, beginning with degree 
n= 2k + 1 (i.e., when the self-intersections are eliminated). 


EXAMPLE. The open swallowtail is the first stable variety over the ordinary 
swallowtail. 


The appearance of swallowtails in the obstacle problem was axiomatized 
by Givental’ (1982) in his theory of triads. 

Definition. A symplectic triad (H, L, l) consists of a smooth hypersurface H in
a symplectic manifold and a lagrangian submanifold L which is tangent to H
to first order along a hypersurface l of L.


The lagrangian variety generated by the triad is the image of L in the 
manifold of characteristics of the hypersurface H. 


EXAMPLE 1. Consider, in the problem of bypassing an obstacle with boundary
Γ ⊂ R^n, the distance along geodesics from an initial front as a function
s: Γ → R. The manifold L consisting of all extensions of the 1-form ds from
Γ to R^n, together with the hypersurface H: p^2 = 1, forms a triad. The lagrangian
variety generated by this triad is precisely the variety of rays tangent to the
geodesics in our system of extremals on Γ.


EXAMPLE 2. In the symplectic manifold of monic polynomials F = x^d +
λ_1 x^{d−1} + ··· + λ_d with even degree d = 2m, the polynomials divisible by x^m
form a lagrangian submanifold L.

Consider the hamiltonian for translation along the x-axis. [This polynomial
in λ is equal to


h = Σ (−1)^i F^(i) F^(j),   i + j = d,   F^(i) = d^i F/dx^i.


The hypersurface h = 0 is tangent to the lagrangian submanifold L along
the subspace l of polynomials divisible by x^{m+1}, thus forming a triad. The
lagrangian variety generated by this triad is an open swallowtail of dimension
m − 1 (the set of polynomials x^{d−1} + a_1 x^{d−3} + ··· + a_{d−2} having a root of
multiplicity greater than half the degree).]
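
The translation invariance of h can be tested directly. The sketch below assumes that the formula for h reads h = Σ (−1)^i F^(i) F^(j) with i + j = d and F^(i) the i-th derivative of F (the sum taken over i = 0, ..., d; any overall normalization is ignored). The function names are hypothetical. The value of the sum does not depend on the point x at which the derivatives are evaluated, and on polynomials divisible by x^m it reduces to the square of F^(m)(0), which vanishes to second order where F is divisible by x^{m+1}.

```python
def poly_deriv(p):
    """Coefficients of the derivative, lowest degree first."""
    return [k * c for k, c in enumerate(p)][1:]

def poly_eval(p, x):
    """Evaluate by Horner's rule."""
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result

def derivatives_at(p, x):
    """Values F(x), F'(x), ..., F^(d)(x) for a coefficient list of degree d."""
    values = []
    current = list(p)
    for _ in range(len(p)):
        values.append(poly_eval(current, x))
        current = poly_deriv(current)
    return values

def translation_hamiltonian(p, x=0):
    """Assumed form: sum over i + j = d of (-1)**i * F^(i)(x) * F^(j)(x)."""
    d = len(p) - 1
    derivs = derivatives_at(p, x)
    return sum((-1) ** i * derivs[i] * derivs[d - i] for i in range(d + 1))

if __name__ == "__main__":
    F = [0, 0, 3, 0, 1]            # x**4 + 3*x**2, divisible by x**2 (m = 2)
    print([translation_hamiltonian(F, x) for x in range(-2, 3)])
```

For d = 2 the sum equals 4λ_2 − λ_1^2, the negative of the discriminant of x^2 + λ_1 x + λ_2, which is visibly invariant under translations of x.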


Theorem (A. B. Givental’, 1982). The triads in Example 2 are stable. Every germ 
of a generic triad is diffeomorphic to a germ of a triad in Example 2. 


Corollary. The variety of rays tangent to the geodesics in the system of extremals 
of a generic obstacle problem is locally symplectically diffeomorphic to a 
lagrangian open swallowtail. 


In contact geometry, there are two kinds of Legendre varieties associated 
to obstacle problems: varieties of contact elements of fronts and varieties 
of 1-jets of time functions. The first of these are diffeomorphic to lagrangian 
open swallowtails; the second are diffeomorphic to cylinders over the 
first. 


EXAMPLE. Consider the problem of bypassing an obstacle in the plane which 
is bounded by a curve with an inflection point. The fronts, which are the 
evolvents of the curve, have two kinds of singularities: ordinary cusps (of order 
3/2) on the curve itself and singularities of order 5/2 on the tangent line 
through the inflection point (Figure 264). Over points of the boundary curve,
the Legendre variety is nonsingular, while over points on the tangent line
through the inflection point it has a cuspidal edge of order 3/2.


Figure 264 The evolvents of a cubical parabola


Theorem (1978). In the space of contact elements to the plane, fibered over the 
plane itself, the surface consisting of the contact elements of the evolvents of 
a generic curve near a point of inflection is locally equivalent by a fiber- 
preserving diffeomorphism to the surface consisting of all polynomials with 
multiple roots in the space of polynomials x^3 + ax^2 + bx + c, fibered into
lines parallel to the b-axis. 


This surface (Figure 265), together with the surface c = 0 representing the 
contact elements along the boundary curve, forms a variety which is diffeomorphic
to the set of irregular orbits for the reflection group B_3. This observation
led to the theory of boundary singularities (1978).


Figure 265 The surface of contact elements of the evolvents
(labels: curve; tangent through the inflection point)

Figure 266 The caustic singularity F_4


EXAMPLE (I. G. Shcherbak, 1982). Consider a generic curve on a surface in
three-dimensional euclidean space. At certain points, the direction of the curve
coincides with a principal curvature direction of the surface. It follows from
the theory of lagrangian boundary singularities that the Weyl group F_4 is
connected with each such point: the focal points of the surface (A,), focal 
points of the curve (A), and normals to the surface at points of the curve (B,) 
together form an F, caustic near the center of curvature (Figure 266). 


We will not dwell here on the theory of boundary singularities, but it is 
worth mentioning the “Lagrange duality” relating a function and its restric- 
tion to the boundary (up to stable equivalence): this may be thought of as 
a modern version of the Lagrange multiplier rule (I. G. Shcherbak, 1982). 

Returning to inflection points of plane curves, we consider the graph of the 
multiple-valued time function in an obstacle problem. The level curves of this 
function are the evolvents of the obstacle boundary. Therefore, the graph of 
this function has the form (shown in Figure 267) of a surface with two cuspidal 
edges (of orders 3/2 and 5/2). When I showed this surface to A. B. Givental’, 
he recognized O. V. Lyashko’s drawing of the variety Σ of singular orbits of the
group H_3 (symmetries of the icosahedron). Givental’s conjecture was soon verified:


Figure 267 The discriminant of H_3


Theorem (O. P. Shcherbak, 1982). The graph of the (multiple-valued) time
function in the problem of bypassing an obstacle bounded by a generic plane
curve is formally diffeomorphic near an inflection point of the curve to the
variety Σ.


The proof of this theorem uses: 


Theorem (O. V. Lyashko, 1981). The variety Σ is diffeomorphic to the variety
of polynomials x^5 + ax^4 + bx^2 + c having a multiple root.


Lyashko’s theorem describes the variety of singular orbits for the group H_3
as the union of the tangents to the curve (t, t^3, t^5), while Shcherbak’s theorem
applies to any curve of the form (t + o(t), t^3 + o(t^3), t^5 + o(t^5)).

The same singularity appears on a generic front at the point of tangency of 
an asymptotic ray with the bounding surface of an obstacle in R^3.

Finally, we describe a variational problem leading to the singularity H_4
(after O. P. Shcherbak).

The group H_4 consists of the symmetries of a regular polyhedron in R^4. Its
120 vertices lie on S^3 ≈ SU(2) and form the binary icosahedral group (the
binary group being the inverse image of the symmetry group of the icosahedron
under the double covering S^3 → SO(3)).

Consider the problem of bypassing an obstacle bounded by a smooth 
surface in three-dimensional euclidean space. The extremals beginning at a 
fixed point outside the obstacle generate a pencil (one-parameter family) of 
geodesics on the surface. A time function is the distance from a fixed initial 
manifold (e.g., a point) along stationary (not necessarily minimizing) paths 
consisting of arcs of geodesics and their tangents, considered as a (multiple- 
valued) function of the terminal point in space (solution of the Hamilton—Jacobi 
equation). 


Theorem (O. P. Shcherbak, 1984). For a generic obstacle, the graph of the time 
function at a point which is focal for the pencil along an asymptotic tangent 
at a parabolic point of the surface is locally diffeomorphic to the variety Σ of
singular orbits of the group H_4.

An explicit parametrization of Σ is:


(a, b?/2 + ac, c?/2 + ab, b°/5 + c3/3 + abr). 


The group H_4 is related to a four-dimensional subspace of the base space
of the versal deformation of E_8 (this connection is explained in Remark 7, §9
of the paper by V. I. Arnold, Indices of singular points of 1-forms on mani- 
folds with boundary, convolution of invariants of reflection groups, and 
singular projections of smooth surfaces, Russian Math. Surveys 34:2 (1979), 
1-42). 

Figure 268 The caustic singularity H_4


Figure 269 The front perestroika H_4


Corresponding to this four-dimensional subspace, there is an embedding
of the local algebra D_4 into the local algebra E_8, which induces on the former
the same grading which is given by the convolution of invariants of H_4.
O. P. Shcherbak has shown that this relationship establishes yet another
description of the variety of singular orbits of H_4:


Theorem. Consider those values of λ for which the curve x^5 + y^3 + λ_1 x^3 y +
λ_2 x^3 + λ_3 y + λ_4 = 0 is singular. One of the irreducible components of this
three-dimensional hypersurface in λ-space is diffeomorphic to the variety of
singular orbits of the group H_4.


The caustic and three typical sections of the variety of singular orbits of H_4
are shown in Figures 268 and 269. See O. P. Shcherbak, Wavefronts and
reflection groups, Russian Math. Surveys, 43 (1988).